DETAILED ACTION 

Remarks 

1. Receipt of Applicant's Amendment, filed on 09/13/2010, is acknowledged. The 
amendment includes the amending of claims 1, 11, 17, 28, and 38, and the cancellation 
of claims 8, 12, 24, 29, 34, and 39-47. 

Claim Rejections - 35 USC § 101 

2. 35 U.S.C. 101 reads as follows: 

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of 
matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the 
conditions and requirements of this title. 

3. Claim 11 is rejected under 35 U.S.C 101 because the claimed invention is 
directed to the non-statutory subject area of electro-magnetic signals and carrier waves. 
Claim 5 is directed towards a "computer-readable medium". Because the 
specification provides no support for a "computer-readable medium", the examiner 
considers the claimed "computer-readable medium" in the broadest reasonable 
interpretation as being directed towards the non-statutory subject matter of electronic 
data signals/carrier waves/propagation waves. 

Claims 13-16, and 50-51 are rejected for incorporating the deficiencies of 
independent claim 11. 

Claim 11 is directed towards a computer-readable medium. However, all of the 
elements claimed could be reasonably interpreted by an ordinary artisan as being 
software alone, and thus is directed to software per se, which is non-statutory. 

Specifically, because the specification provides no support for a 
"computer-readable medium", the examiner considers the claimed "computer-readable medium" 
in the broadest reasonable interpretation as being directed towards the non-statutory 
subject matter of electronic data signals/carrier waves/propagation waves, and is thus 
directed towards software per se. 

In order for such a software claim to be statutory, it must be claimed in 
combination with an appropriate medium and/or hardware such as a memory or 
processor to establish a statutory category of invention and enable any functionality to 
be realized. 

Claims 13-16, and 50-51 are rejected for incorporating the deficiencies of 
independent claim 11. 

Claim 17 is rejected under 35 U.S.C 101 because the claimed invention is 
directed to the non-statutory subject area of electro-magnetic signals and carrier waves. 
Claim 5 is directed towards a "computer-readable media". Because the specification 
provides no support for a "computer-readable media", the examiner considers the 
claimed "computer-readable medium" in the broadest reasonable interpretation as being 
directed towards the non-statutory subject matter of electronic data signals/carrier 
waves/propagation waves. 

Claims 18-23, 25-27, and 52-53 are rejected for incorporating the deficiencies of 
independent claim 17. 

Claim 17 is directed towards a computer-readable medium. However, all of the 
elements claimed could be reasonably interpreted by an ordinary artisan as being 
software alone, and thus is directed to software per se, which is non-statutory. 

Specifically, because the specification provides no support for a 
"computer-readable media", the examiner considers the claimed "computer-readable medium" in 
the broadest reasonable interpretation as being directed towards the non-statutory 
subject matter of electronic data signals/carrier waves/propagation waves, and is thus 
directed towards software per se. 

In order for such a software claim to be statutory, it must be claimed in 
combination with an appropriate medium and/or hardware such as a memory or 
processor to establish a statutory category of invention and enable any functionality to 
be realized. 

Claims 18-23, 25-27, and 52-53 are rejected for incorporating the deficiencies of 
independent claim 17. 

Claim Objections 

4. Claims 13-16 are objected to because of the following informalities: Claims 13-16 
and 50-51 are directed towards a graphical user interface even though parent 
independent claim 11 is directed towards a computer-readable medium. Claims 13-16 
and 50-51 should be amended to be directed towards a computer-readable medium. 
Appropriate correction is required. 

Claim 57 objected to because of the following informalities: Claim 57 is directed 
towards a computer even though parent dependent claim 56 and parent independent 
claim 38 are each directed towards a method. Claim 56 should be amended to be 
directed towards a method. Appropriate correction is required. 

Claim Rejections - 35 USC § 103 

5. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

6. Claims 1-7, 9-11, 13-23, 25-28, 30-33, 35-38, 48, 52, 54, 56, and 58 are rejected 
under 35 U.S.C. 103(a) as being unpatentable over Bellegarda et al. (Article entitled 
"Exploiting Latent Semantic Information in Statistical Language Modeling", dated 
10/26/2000) and in view of Vivisimo (Article entitled "Vivisimo FAQ", dated 02/04/2002), 
and further in view of Moore et al. (U.S. PGPUB 2004/0193621). 

7. Regarding claim 1, Bellegarda teaches a method comprising: 

A) mapping the files in the file system into a semantic vector space (Page 1279, 
Abstract); 

B) clustering the files within said space (Pages 1279 and 1291, Abstract). 

C) wherein multiple threshold values that are settable to desired levels of granularity 
are defined, and said files are clustered based on said multiple threshold values (Page 
1284) 

The examiner notes that Bellegarda teaches "mapping the files in the file 
system into a semantic vector space" as "(discrete) words and documents are 
mapped onto a (continuous) semantic vector space, in which familiar clustering 
techniques can be applied. This leads to the specification of a powerful framework for 
automatic semantic classification, as well as the derivation of several language model 
families with various smoothing properties" (Page 1279, Abstract) and "The general 
domain considered was business news, as reflected in the WSJ portion of the NAB 
corpus" (Page 1291). The examiner further notes that Bellegarda teaches "clustering 
the files within said space" as "(discrete) words and documents are mapped onto a 
(continuous) semantic vector space, in which familiar clustering techniques can be 
applied. This leads to the specification of a powerful framework for automatic semantic 
classification, as well as the derivation of several language model families with various 
smoothing properties" (Page 1279, Abstract). The examiner further notes that 
Bellegarda teaches "wherein multiple threshold values that are settable to desired 
levels of granularity are defined, and said files are clustered based on said 
multiple threshold values" as "Once (11) is specified, it is straightforward to proceed 
with the clustering of the word vectors, using any of a variety of algorithms (see, for 
instance, [2]). Since the number of such vectors is relatively large, it is advisable to 
perform this clustering in stages, using, for example, K-means and bottom-up clustering 
sequentially. In that case, K-means clustering is used to obtain a coarse partition of the 
vocabulary into a small set of superclusters. Each supercluster is then itself partitioned 
using bottom-up clustering, resulting in a final set of clusters Ck, 1<=k<=K. This 
process can be thought of as uncovering, in a data-driven fashion, a particular layer of 
semantic knowledge in the space" (Page 1284). 
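
For illustration only, the staged clustering described in the quoted passage (a coarse K-means partition into superclusters, followed by bottom-up clustering within each supercluster) might be sketched as follows. The function names, the Euclidean distance, the explicitly supplied starting centers, the merge cutoff `max_merge_distance`, and the sample points are all assumptions made for this sketch; none of them is taken from Bellegarda or from the application.

```python
import math

def dist(a, b):
    # Euclidean distance (an assumption; the cited article may use another measure)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    # Component-wise centroid of a non-empty list of points
    return tuple(sum(c) / len(points) for c in zip(*points))

def k_means(points, centers, iterations=10):
    """Coarse stage: assign each point to its nearest center, then recompute centers."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            groups[nearest].append(p)
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return [g for g in groups if g]

def bottom_up(points, max_merge_distance):
    """Fine stage: repeatedly merge the two clusters with the closest centroids."""
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        pairs = [(dist(mean(a), mean(b)), i, j)
                 for i, a in enumerate(clusters)
                 for j, b in enumerate(clusters) if i < j]
        d, i, j = min(pairs)
        if d > max_merge_distance:
            break  # remaining clusters are too far apart to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical sample data: two well-separated regions
points = [(0, 0), (0, 1), (3, 0), (3, 1), (20, 20), (20, 21), (24, 20)]
superclusters = k_means(points, centers=[(0, 0), (20, 20)])
final_clusters = [c for s in superclusters
                  for c in bottom_up(s, max_merge_distance=1.5)]
```

In this sketch the merge cutoff plays the role of a granularity setting: a larger value yields fewer, coarser clusters within each supercluster.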
Bellegarda does not explicitly teach: 

D) deriving a hierarchy of plural level of clusters from said clustering; 

E) providing a user an option of displaying the files in a hierarchical format of plural 
level of clusters based on said derived hierarchy. 

Vivisimo, however, teaches "deriving a hierarchy of plural level of clusters 
from said clustering" as "Document clustering is the automatic organization of 
documents into groups or clusters. "Document clustering" differs from other techniques 
(classification, taxonomy building, Northern Light, etc.) in that it is fully automated: there 
is no human intervention at any point (except that people wrote the basic algorithms). 
The biggest challenge for document clustering has been to quickly find meaningful 
groups that are concisely annotated. Our innovation relies on a newly discovered 
heuristic algorithm that does this well. Our clustering algorithm has achieved good 
results on web pages, patent abstracts, newswires, meeting transcripts, and television 
transcripts with little or no customization in every case" (Page 03), "Instead of producing 
a flat list of groups, Vivisimo organizes groups into a hierarchy or tree, using a well- 
known "Windows Explorer"-style interface. This interface can be used with no training 
since it is quite intuitive. Users can zoom in on items of interest while keeping an 
overview of all the search results" (Page 03), "No. Simple one-word queries often lead 
to clusters that modify the query. For example, "soap" can lead to "soap opera", 
"handmade soap", and "soap bubbles", but also to "simple object access protocol", 
known also by its SOAP acronym" (Page 04), and "No. Sometimes a document fits well 
in more than one place in the hierarchy, so we place it everywhere it fits. For users, this 
is better than forcing documents to fit in a single location" (Page 04), and "providing a 
user an option of displaying the files in a hierarchical format of plural level of 
clusters based on said derived hierarchy" as "Document clustering is the automatic 
organization of documents into groups or clusters. "Document clustering" differs from 
other techniques (classification, taxonomy building, Northern Light, etc.) in that it is fully 
automated: there is no human intervention at any point (except that people wrote the 
basic algorithms). The biggest challenge for document clustering has been to quickly 
find meaningful groups that are concisely annotated. Our innovation relies on a newly 
discovered heuristic algorithm that does this well. Our clustering algorithm has achieved 
good results on web pages, patent abstracts, newswires, meeting transcripts, and 
television transcripts with little or no customization in every case" (Page 03), "Instead of 
producing a flat list of groups, Vivisimo organizes groups into a hierarchy or tree, using 
a well-known "Windows Explorer"-style interface. This interface can be used with no 
training since it is quite intuitive. Users can zoom in on items of interest while keeping 
an overview of all the search results" (Page 03), "No. Simple one-word queries often 
lead to clusters that modify the query. For example, "soap" can lead to "soap opera", 
"handmade soap", and "soap bubbles", but also to "simple object access protocol", 
known also by its SOAP acronym" (Page 04), and "No. Sometimes a document fits well 



Application/Control Number: 10/644,815 Page 7 

Art Unit: 2168 

in more than one place in the hierarchy, so we place it everywhere it fits. For users, this 
is better than forcing documents to fit in a single location" (Page 04). 

The examiner further notes that the non-applied art of Arnold shows an interface 
of the Vivisimo search engine. Specifically, there is shown clusters of hierarchical 
folders that allow a user to drill down further if need be. 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Vivisimo's teachings would have allowed Bellegarda's method to provide clustering 
that is user-friendly, concise, and fast, as noted by Vivisimo (Page 04). 

Bellegarda and Vivisimo do not explicitly teach: 
E) providing a user an option of displaying the files in a hierarchical format based on 
locations of the files in the file system. 

Moore, however, teaches "providing a user an option of displaying the files 
in a hierarchical format based on locations of the files in the file system" as "FIG. 
5 is a tree diagram of a folder structure in accordance with a physical folder 
arrangement on a hard drive. This physical folder arrangement is based on the 
traditional implementation of folders, which may be based on NTFS or other existing file 
systems. Such folders are referred to as physical folders because their structuring is 
based on the actual physical underlying file system structure on the disk. As will be 
described in more detail below, this is in contrast to virtual folders, which create 
location-independent views that allow users to manipulate files and folders in ways that 
are similar to those currently used for manipulating physical folders" (Paragraph 95) and 
"FIG. 17 is a diagram illustrative of a screen display in which a quick link for physical 
folders is selected. The selection box SB is shown to be around the "all folders" quick 
link 616. As will be described in more detail below with respect to FIG. 18, the "all 
folders" quick link 616 provides for switching to a view of physical folders. FIG. 18 is a 
diagram illustrative of a screen display showing physical folders. The physical folders 
that are shown contain the files of the virtual folder stacks of FIG. 17. In other words, the 
items contained within the stacks 651-655 of FIG. 17 are also contained in certain 
physical folders in the system. These are shown in FIG. 18 as a "My Documents" folder 
851 that is located on the present computer, a "Desktop" folder 852 that is located on 
the present computer, a "Foo" folder 853 that is located on the hard drive C:, a "My 
Files" folder 854 that is located on a server, an "External Drive" folder 855 that is 
located on an external drive, a "My Documents" folder 856 that is located on another 
computer, and a "Desktop" folder 857 that is located on another computer" (Paragraphs 
115-116). 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Moore's teachings would have allowed the combination of Bellegarda and Vivisimo 
to provide users the ability to toggle between virtual folder representations and 
physical folder representations based on their desires, as noted by Moore (Paragraph 117). 

Regarding claim 2, Bellegarda does not explicitly teach a method comprising: 
A) wherein the step of clustering the files is performed as a background routine during 
the operation of a computer associated with said file system. 

Vivisimo, however, teaches "wherein the step of clustering the files is 
performed as a background routine during the operation of a computer 
associated with said file system" as "Clustering is done just before the user sees the 
search results, just in time. There is no need to prepare anything beforehand, much less 
pre-process the entire document collection from where the results came. Clustering is a 
fully automatic process that requires no preparation steps, and hence no maintenance. 
Classification requires pre-specifying categories (typically broad and hence rather 
bland) and updating these categories as new documents are added to the collection" 
(Page 03). 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Vivisimo's teachings would have allowed Bellegarda's method to provide clustering 
that is user-friendly, concise, and fast, as noted by Vivisimo (Page 04). 

Regarding claim 3, Bellegarda further teaches a method comprising: 
A) wherein the step of clustering the files is performed in response to the creation of a 
new file within the file system (Page 1286, Section: A. Framework Extension). 

The examiner notes that Bellegarda teaches "wherein the step of clustering 
the files is performed in response to the creation of a new file within the file 
system" as "finding a new representation for a new document in the space S is 
straightforward" (Page 1286, Section: A. Framework Extension). The examiner further 
notes that it is clear that the method of Bellegarda clusters when a new document is 
noticed. 

Regarding claim 4, Bellegarda further teaches a method comprising: 

A) wherein said files are text documents (Page 1279, Abstract); and 

B) said mapping is conducted on the basis of a language model (Page 1279, Abstract). 

The examiner notes that Bellegarda teaches "wherein said files are text 
documents" as "This paper focuses on the use of latent semantic analysis, a paradigm 
that automatically uncovers the salient semantic relationships between words and 
documents in a given corpus" (Page 1279, Abstract). The examiner further notes that 
Bellegarda teaches "said mapping is conducted on the basis of a language 
model" as "(discrete) words and documents are mapped onto a (continuous) semantic 
vector space, in which familiar clustering techniques can be applied. This leads to the 
specification of a powerful framework for automatic semantic classification, as well as 
the derivation of several language model families with various smoothing properties" 
(Page 1279, Abstract). 

Regarding claim 5, Bellegarda further teaches a method comprising: 

A) wherein said mapping step comprises the steps of constructing a matrix which 
associates each word in the documents with a vector (Page 1281, Section: A. Feature 
Extraction, Section: B. Singular Value Decomposition); and 

B) associates each document with a vector (Page 1281, Section: A. Feature 
Extraction, Section: B. Singular Value Decomposition). 



The examiner notes that Bellegarda teaches "wherein said mapping step 
comprises the steps of constructing a matrix which associates each word in the 
documents with a vector" as "The starting point is the construction of a matrix (W) of 
co-occurrences between words and documents" (Page 1281, Section: A. Feature 
Extraction) and "The (M x N) word-document matrix W resulting from the above feature 
extraction defines two vector representations for the words and the documents. Each 
word wi can be uniquely associated with a row vector of dimension N, and each 
document dj can be uniquely associated with a column vector of dimension M" (Page 
1281, Section: B. Singular Value Decomposition). The examiner further notes that 
Bellegarda teaches "associates each document with a vector" as "The (M x N) 
word-document matrix W resulting from the above feature extraction defines two vector 
representations for the words and the documents. Each word wi can be uniquely 
associated with a row vector of dimension N, and each document dj can be uniquely 
associated with a column vector of dimension M" (Page 1281, Section: B. Singular 
Value Decomposition). 
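
As a rough illustration of the word-document matrix described in the quoted passage, the sketch below builds an M x N count matrix in which each row is a word vector of dimension N and each column is a document vector of dimension M. The function name `build_word_document_matrix`, the whitespace tokenizer, the sample documents, and the use of raw counts are assumptions for illustration only; any weighting or normalization used in the cited article is not reproduced here.

```python
from collections import Counter

def build_word_document_matrix(documents):
    """Return (vocabulary, matrix) where matrix[i][j] counts word i in document j."""
    tokenized = [doc.lower().split() for doc in documents]
    # M distinct words, sorted so that row order is deterministic
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    counts = [Counter(tokens) for tokens in tokenized]
    # One row per word, one column per document
    matrix = [[c[word] for c in counts] for word in vocabulary]
    return vocabulary, matrix

# Hypothetical three-document corpus
docs = ["stocks fell sharply", "stocks rose", "rain fell"]
vocab, W = build_word_document_matrix(docs)

# Row vector (dimension N = 3) associated with the word "fell"
row_for_fell = W[vocab.index("fell")]
# Column vector (dimension M = 5) associated with the first document
column_for_doc0 = [row[0] for row in W]
```

A decomposition such as the SVD referred to in the cited section would then operate on a matrix of this shape.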

Regarding claim 6, Bellegarda further teaches a method comprising: 
A) the step of decomposing said matrix to define the words and documents as vectors 
in a continuous vector space (Page 1281, Section: A. Feature Extraction, Section: B. 
Singular Value Decomposition). 

The examiner notes that Bellegarda teaches "the step of decomposing said 
matrix to define the words and documents as vectors in a continuous vector 
space" as "To address these issues, it is useful to employ a singular value 
decomposition (SVD), a technique closely related to eigenvector decomposition and 
factor analysis" (Page 1281, Section: B. Singular Value Decomposition). 

Regarding claim 7, Bellegarda further teaches a method comprising: 
A) wherein said clustering is performed by identifying documents whose vectors are 
within a threshold distance of one another (Page 1284, Section: A. Word Clustering). 

The examiner notes that Bellegarda teaches "wherein said clustering is 
performed by identifying documents whose vectors are within a threshold 
distance of one another" as "This opens up the opportunity to apply familiar clustering 
techniques in S, as long as a distance measure consistent with the SVD formalism is 
defined on the vector space" (Page 1286, Section: A. Framework Extension). 
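
To make the claim language concrete, one way of grouping document vectors that lie within a threshold distance of one another is sketched below. This is a minimal single-linkage illustration under stated assumptions: the Euclidean distance, the function name `cluster_by_threshold`, and the sample vectors are hypothetical and are not drawn from Bellegarda, which refers only to a distance measure consistent with the SVD formalism.

```python
import math

def euclidean(a, b):
    # Assumed distance measure for this sketch
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cluster_by_threshold(vectors, threshold):
    """Group indices so that vectors within `threshold` of each other share a cluster."""
    parent = list(range(len(vectors)))

    def find(i):
        # Follow parent links to the cluster representative, shortening the path
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if euclidean(vectors[i], vectors[j]) <= threshold:
                parent[find(j)] = find(i)  # merge the two clusters

    groups = {}
    for i in range(len(vectors)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical document vectors
doc_vectors = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
fine = cluster_by_threshold(doc_vectors, threshold=1.5)
coarse = cluster_by_threshold(doc_vectors, threshold=8.0)
```

Running the same routine with a small and a large threshold gives a finer and a coarser grouping of the same vectors, which is one way to read "levels of granularity".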

Regarding claim 9, Bellegarda does not explicitly teach a method comprising: 
A) including the step of automatically labeling the clusters based on the resulting 
clusters. 

Vivisimo, however, teaches "including the step of automatically labeling the 
clusters based on the resulting clusters" as "Document clustering is the automatic 
organization of documents into groups or clusters. "Document clustering" differs from 
other techniques (classification, taxonomy building, Northern Light, etc.) in that it is fully 
automated: there is no human intervention at any point (except that people wrote the 
basic algorithms). The biggest challenge for document clustering has been to quickly 
find meaningful groups that are concisely annotated. Our innovation relies on a newly 
discovered heuristic algorithm that does this well. Our clustering algorithm has achieved 
good results on web pages, patent abstracts, newswires, meeting transcripts, and 
television transcripts with little or no customization in every case" (Page 03) and 
"Instead of producing a flat list of groups, Vivisimo organizes groups into a hierarchy or 
tree, using a well-known "Windows Explorer"-style interface. This interface can be used 
with no training since it is quite intuitive. Users can zoom in on items of interest while 
keeping an overview of all the search results" (Page 03), "Conceptual clustering 
methods interleave the process of forming groups with the step of annotating them, 
much like people might do by hand. So, if Vivisimo tries to form a group but judges that 
the group cannot be described well, the group is rejected. In contrast, some other 
approaches rely mainly on mathematical optimization, in which description of the groups 
is relegated to the end after the groups are formed, which gives generally worse results" 
(Page 03), and "We are gratified that users sometimes ask this. The annotations are 
created spontaneously by the software. When they are good, it seems that a human 
being must have created the categories and the machine merely recognizes the 
documents that belong there, which is not the case. However, our technology is not 
perfect: the diligent user will surely spot an occasional annotation that only a machine 
would make up" (Page 04). 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Vivisimo's teachings would have allowed Bellegarda's method to provide clustering 
that is user-friendly, concise, and fast, as noted by Vivisimo (Page 04). 

Regarding claim 10, Bellegarda does not explicitly teach a method comprising: 
A) wherein said labeling comprises selecting representative words based on the 
closeness of their vectors to the document vectors in a cluster. 

Vivisimo, however, teaches "wherein said labeling comprises selecting 
representative words based on the closeness of their vectors to the document 
vectors in a cluster" as "Document clustering is the automatic organization of 
documents into groups or clusters. "Document clustering" differs from other techniques 
(classification, taxonomy building, Northern Light, etc.) in that it is fully automated: there 
is no human intervention at any point (except that people wrote the basic algorithms). 
The biggest challenge for document clustering has been to quickly find meaningful 
groups that are concisely annotated. Our innovation relies on a newly discovered 
heuristic algorithm that does this well. Our clustering algorithm has achieved good 
results on web pages, patent abstracts, newswires, meeting transcripts, and television 
transcripts with little or no customization in every case" (Page 03) and "Instead of 
producing a flat list of groups, Vivisimo organizes groups into a hierarchy or tree, using 
a well-known "Windows Explorer"-style interface. This interface can be used with no 
training since it is quite intuitive. Users can zoom in on items of interest while keeping 
an overview of all the search results" (Page 03), "Conceptual clustering methods 
interleave the process of forming groups with the step of annotating them, much like 
people might do by hand. So, if Vivisimo tries to form a group but judges that the group 
cannot be described well, the group is rejected. In contrast, some other approaches rely 
mainly on mathematical optimization, in which description of the groups is relegated to 
the end after the groups are formed, which gives generally worse results" (Page 03), 
and "We are gratified that users sometimes ask this. The annotations are created 
spontaneously by the software. When they are good, it seems that a human being must 
have created the categories and the machine merely recognizes the documents that 
belong there, which is not the case. However, our technology is not perfect: the diligent 
user will surely spot an occasional annotation that only a machine would make up" 
(Page 04). 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Vivisimo's teachings would have allowed Bellegarda's method to provide clustering 
that is user-friendly, concise, and fast, as noted by Vivisimo (Page 04). 

Regarding claim 11, Bellegarda teaches a graphical user interface comprising: 

A) a virtual file system (Page 1279, Abstract); and 

B) clustering said files based on multiple threshold values that are settable to desired 
levels of granularity (Page 1284). 

The examiner notes that Bellegarda teaches "a virtual file system with a 
semantic hierarchy, wherein the semantic hierarchy is based on clustering of files 
based on semantic similarities" as "(discrete) words and documents are mapped 
onto a (continuous) semantic vector space, in which familiar clustering techniques can 
be applied. This leads to the specification of a powerful framework for automatic 
semantic classification, as well as the derivation of several language model families with 
various smoothing properties" (Page 1279, Abstract). The examiner further notes that 
Bellegarda teaches "clustering said files based on multiple threshold values that 
are settable to desired levels of granularity" as "Once (11) is specified, it is 
straightforward to proceed with the clustering of the word vectors, using any of a variety 
of algorithms (see, for instance, [2]). Since the number of such vectors is relatively 
large, it is advisable to perform this clustering in stages, using, for example, K-means 
and bottom-up clustering sequentially. In that case, K-means clustering is used to obtain 
a coarse partition of the vocabulary into a small set of superclusters. Each supercluster 
is then itself partitioned using bottom-up clustering, resulting in a final set of clusters Ck, 
1<=k<=K. This process can be thought of as uncovering, in a data-driven fashion, a 
particular layer of semantic knowledge in the space" (Page 1284). 

Bellegarda does not explicitly teach: 
A) A graphical user interface configured to display files with a semantic hierarchy of 
plural levels of clusters that is derived from semantic similarities of said files; 
C) determining a directory structure having plural levels of clusters based on the 
clustering determined from similarities between said files, wherein the graphical user 
interface provides a user an option of graphically displaying the determined directory 
structure having plural levels of clusters to be displayed on a display device. 

Vivisimo, however, teaches "A graphical user interface configured to display 
files with a semantic hierarchy of plural levels of clusters that is derived from 
semantic similarities of said files" as "Document clustering is the automatic 
organization of documents into groups or clusters. "Document clustering" differs from 
other techniques (classification, taxonomy building, Northern Light, etc.) in that it is fully 
automated: there is no human intervention at any point (except that people wrote the 
basic algorithms). The biggest challenge for document clustering has been to quickly 
find meaningful groups that are concisely annotated. Our innovation relies on a newly 
discovered heuristic algorithm that does this well. Our clustering algorithm has achieved 
good results on web pages, patent abstracts, newswires, meeting transcripts, and 
television transcripts with little or no customization in every case" (Page 03), "Instead of 
producing a flat list of groups, Vivisimo organizes groups into a hierarchy or tree, using 
a well-known "Windows Explorer"-style interface. This interface can be used with no 
training since it is quite intuitive. Users can zoom in on items of interest while keeping 
an overview of all the search results" (Page 03), "No. Simple one-word queries often 
lead to clusters that modify the query. For example, "soap" can lead to "soap opera", 
"handmade soap", and "soap bubbles", but also to "simple object access protocol", 
known also by its SOAP acronym" (Page 04), and "No. Sometimes a document fits well 
in more than one place in the hierarchy, so we place it everywhere it fits. For users, this 
is better than forcing documents to fit in a single location" (Page 04), and "determining 
a directory structure having plural levels of clusters based on the clustering 
determined from similarities between said files, wherein the graphical user 
interface provides a user an option of graphically displaying the determined 
directory structure having plural levels of clusters to be displayed on a display 
device" as "Document clustering is the automatic organization of documents into 
groups or clusters. "Document clustering" differs from other techniques (classification, 
taxonomy building, Northern Light, etc.) in that it is fully automated: there is no human 
intervention at any point (except that people wrote the basic algorithms). The biggest 
challenge for document clustering has been to quickly find meaningful groups that are 
concisely annotated. Our innovation relies on a newly discovered heuristic algorithm 
that does this well. Our clustering algorithm has achieved good results on web pages, 
patent abstracts, newswires, meeting transcripts, and television transcripts with little or 
no customization in every case" (Page 03), "Instead of producing a flat list of groups, 
Vivisimo organizes groups into a hierarchy or tree, using a well-known "Windows 
Explorer"-style interface. This interface can be used with no training since it is quite 
intuitive. Users can zoom in on items of interest while keeping an overview of all the 
search results" (Page 03), "No. Simple one-word queries often lead to clusters that 
modify the query. For example, "soap" can lead to "soap opera", "handmade soap", and 
"soap bubbles", but also to "simple object access protocol", known also by its SOAP 
acronym" (Page 04), and "No. Sometimes a document fits well in more than one place 
in the hierarchy, so we place it everywhere it fits. For users, this is better than forcing 
documents to fit in a single location" (Page 04). 

The examiner further notes that the non-applied art of Arnold shows an interface 
of the Vivisimo search engine. Specifically, Arnold shows clusters of hierarchical 
folders that allow a user to drill down further if need be. 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Vivisimo's teachings would have allowed Bellegarda's method to provide clustering that is 
user friendly, concise, and fast, as noted by Vivisimo (Page 04). 

Bellegarda and Vivisimo do not explicitly teach: 
C) wherein the graphical user interface provides a user an option of graphically 
displaying the files in a hierarchical format based on locations of the files in the virtual 
file system. 

Moore, however, teaches "wherein the graphical user interface provides a 
user an option of graphically displaying the files in a hierarchical format based on 
locations of the files in the virtual file system" as "FIG. 5 is a tree diagram of a 
folder structure in accordance with a physical folder arrangement on a hard drive. This 
physical folder arrangement is based on the traditional implementation of folders, which 
may be based on NTFS or other existing file systems. Such folders are referred to as 
physical folders because their structuring is based on the actual physical underlying file 
system structure on the disk. As will be described in more detail below, this is in 
contrast to virtual folders, which create location-independent views that allow users to 
manipulate files and folders in ways that are similar to those currently used for 
manipulating physical folders" (Paragraph 95) and "FIG. 17 is a diagram illustrative of a 
screen display in which a quick link for physical folders is selected. The selection box 
SB is shown to be around the "all folders" quick link 616. As will be described in more 
detail below with respect to FIG. 18, the "all folders" quick link 616 provides for 
switching to a view of physical folders. FIG. 18 is a diagram illustrative of a screen 
display showing physical folders. The physical folders that are shown contain the files of 
the virtual folder stacks of FIG. 17. In other words, the items contained within the stacks 
651-655 of FIG. 17 are also contained in certain physical folders in the system. These 
are shown in FIG. 18 as a "My Documents" folder 851 that is located on the present 
computer, a "Desktop" folder 852 that is located on the present computer, a "Foo" folder 
853 that is located on the hard drive C:, a "My Files" folder 854 that is located on a 
server, an "External Drive" folder 855 that is located on an external drive, a "My 
Documents" folder 856 that is located on another computer, and a "Desktop" folder 857 
that is located on another computer" (Paragraphs 115-116). 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because Moore's 
teachings would have allowed the combination of Bellegarda and Vivisimo to provide users the ability 
to toggle between virtual folder representations and physical folder representations 
based on their desires, as noted by Moore (Paragraph 117). 

Regarding claim 13, Bellegarda does not explicitly teach a graphical user 
interface comprising: 

A) wherein clustering of the files is initiated by user selection. 

Vivisimo, however, teaches "wherein clustering of the files is initiated by 
user selection" as "Clustering is done just before the user sees the search results, just 
in time. There is no need to prepare anything beforehand, much less pre-process the 
entire document collection from where the results came. Clustering is a fully automatic 
process that requires no preparation steps, and hence no maintenance. Classification 
requires pre-specifying categories (typically broad and hence rather bland) and updating 
these categories as new documents are added to the collection" (Page 03). 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Vivisimo's teachings would have allowed Bellegarda's method to provide clustering that is user 
friendly, concise, and fast, as noted by Vivisimo (Page 04). 

Regarding claim 14, Bellegarda further teaches a graphical user interface 
comprising: 

A) wherein clustering of the files is initiated upon creation of a new file in the file system 
(Page 1286, Section: A. Framework Extension). 

The examiner notes that Bellegarda teaches "wherein clustering of the files is 
initiated upon creation of a new file in the file system" as "finding a new 
representation for a new document in the space S is straightforward" (Page 1286, 
Section: A. Framework Extension). The examiner further notes that it is clear that the 
method of Bellegarda clusters when a new document is noticed. 

Regarding claim 15, Bellegarda further teaches a graphical user interface 
comprising: 

A) wherein text files are clustered utilizing a language model (Page 1279, Abstract). 

The examiner notes that Bellegarda teaches "analyzing files in a file system 
to determine similarities in data pertaining to their content" as "(discrete) words 
and documents are mapped onto a (continuous) semantic vector space, in which 
familiar clustering techniques can be applied. This leads to the specification of a 
powerful framework for automatic semantic classification, as well as the derivation of 
several language model families with various smoothing properties" (Page 1279, 
Abstract). 

Bellegarda does not explicitly teach: 

B) non-text files are clustered utilizing rule-based techniques. 

Oliver, however, teaches "non-text files are clustered utilizing rule-based 
techniques" as "Vivisimo now also supports the most advanced features of the major 
search engines using one Vivisimo syntax, which follows the most standard 
conventions. Vivisimo translates your query into the corresponding syntax of each 
underlying search engine. Vivisimo only queries the search engines that support your 
chosen syntax. (Check which engines have been queried by clicking on the Details link 
at the top of the results page.) Thus, you can safely use +,-,... or common Boolean 
operators (NEAR,OR,...) as well as common field searches such as image:, title:, 
link:... or search restrictions such as host: or domain:" (Page 08). 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Vivisimo's teachings would have allowed Bellegarda's method to provide clustering that is user 
friendly, concise, and fast, as noted by Vivisimo (Page 04). 

Regarding claim 16, Bellegarda further teaches a graphical user interface 
comprising: 

A) wherein said language model comprises the LSA paradigm (Page 1281, Section: D. 
Organization). 

The examiner notes that Bellegarda teaches "wherein said language model 
comprises the LSA paradigm" as "The focus of this paper is on semantically driven 
span extension only, and more specifically on how the LSA paradigm can be exploited 
to improve statistical language modeling" (Page 1281, Section: D. Organization). 

Regarding claim 17, Bellegarda teaches a computer-readable media comprising: 

A) analyzing files in a file system to determine similarities in data pertaining to their 
content (Page 1279, Abstract); 

B) clustering said files based on multiple threshold values that are settable to desired 
levels of granularity (1284); 

The examiner notes that Bellegarda teaches "analyzing files in a file system 
to determine similarities in data pertaining to their content" as "(discrete) words 
and documents are mapped onto a (continuous) semantic vector space, in which 
familiar clustering techniques can be applied. This leads to the specification of a 
powerful framework for automatic semantic classification, as well as the derivation of 
several language model families with various smoothing properties" (Page 1279, 
Abstract). The examiner further notes that Bellegarda teaches "clustering said files 
based on multiple threshold values that are settable to desired levels of 
granularity" as "Once (11) is specified, it is straightforward to proceed with the 
clustering of the word vectors, using any of a variety of algorithms (see, for instance, 
[2]). Since the number of such vectors is relatively large, it is advisable to perform this 
clustering in stages, using, for example, K-means and bottom-up clustering sequentially. 
In that case, K-means clustering is used to obtain a coarse partition of the vocabulary 
into a small set of superclusters. Each supercluster is then itself partitioned using 
bottom-up clustering, resulting in a final set of clusters C_k, 1 <= k <= K. This process can be 
thought of as uncovering, in a data-driven fashion, a particular layer of semantic 
knowledge in the space" (1284). 

Bellegarda does not explicitly teach: 

C) determining a directory structure having plural levels of clusters based on the 
clustering determined from similarities between the files; 

D) providing a user an option of displaying files in hierarchical format of plural levels of 
clusters based on the clustering determined from similarities between the files. 

Vivisimo, however, teaches "determining a directory structure having plural 
levels of clusters based on the clustering determined from similarities between 
the files" as "Document clustering is the automatic organization of documents into 
groups or clusters. "Document clustering" differs from other techniques (classification, 
taxonomy building, Northern Light, etc.) in that it is fully automated: there is no human 
intervention at any point (except that people wrote the basic algorithms). The biggest 
challenge for document clustering has been to quickly find meaningful groups that are 
concisely annotated. Our innovation relies on a newly discovered heuristic algorithm 
that does this well. Our clustering algorithm has achieved good results on web pages, 
patent abstracts, newswires, meeting transcripts, and television transcripts with little or 
no customization in every case" (Page 03), "Instead of producing a flat list of groups, 
Vivisimo organizes groups into a hierarchy or tree, using a well-known "Windows 
Explorer"-style interface. This interface can be used with no training since it is quite 
intuitive. Users can zoom in on items of interest while keeping an overview of all the 
search results" (Page 03), "No. Simple one-word queries often lead to clusters that 
modify the query. For example, "soap" can lead to "soap opera", "handmade soap", and 
"soap bubbles", but also to "simple object access protocol", known also by its SOAP 
acronym" (Page 04), and "No. Sometimes a document fits well in more than one place 
in the hierarchy, so we place it everywhere it fits. For users, this is better than forcing 
documents to fit in a single location" (Page 04), and "providing a user an option of 
displaying files in hierarchical format of plural levels of clusters based on the 
clustering determined from similarities between the files" as "Document clustering 
is the automatic organization of documents into groups or clusters. "Document 
clustering" differs from other techniques (classification, taxonomy building, Northern 
Light, etc.) in that it is fully automated: there is no human intervention at any point 
(except that people wrote the basic algorithms). The biggest challenge for document 
clustering has been to quickly find meaningful groups that are concisely annotated. Our 
innovation relies on a newly discovered heuristic algorithm that does this well. Our 
clustering algorithm has achieved good results on web pages, patent abstracts, 
newswires, meeting transcripts, and television transcripts with little or no customization 
in every case" (Page 03), "Instead of producing a flat list of groups, Vivisimo organizes 
groups into a hierarchy or tree, using a well-known "Windows Explorer"-style interface. 
This interface can be used with no training since it is quite intuitive. Users can zoom in 
on items of interest while keeping an overview of all the search results" (Page 03), "No. 
Simple one-word queries often lead to clusters that modify the query. For example, 
"soap" can lead to "soap opera", "handmade soap", and "soap bubbles", but also to 
"simple object access protocol", known also by its SOAP acronym" (Page 04), and "No. 
Sometimes a document fits well in more than one place in the hierarchy, so we place it 
everywhere it fits. For users, this is better than forcing documents to fit in a single 
location" (Page 04). 

The examiner further notes that the non-applied art of Arnold shows an interface 
of the Vivisimo search engine. Specifically, Arnold shows clusters of hierarchical 
folders that allow a user to drill down further if need be. 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Vivisimo's teachings would have allowed Bellegarda's method to provide clustering that is user 
friendly, concise, and fast, as noted by Vivisimo (Page 04). 

Bellegarda and Vivisimo do not explicitly teach: 
D) providing a user an option of displaying the files in a hierarchical format based on 
location of the files in the file system. 

Moore, however, teaches "providing a user an option of displaying the files 
in a hierarchical format based on location of the files in the file system" as "FIG. 5 
is a tree diagram of a folder structure in accordance with a physical folder arrangement 
on a hard drive. This physical folder arrangement is based on the traditional 
implementation of folders, which may be based on NTFS or other existing file systems. 
Such folders are referred to as physical folders because their structuring is based on the 
actual physical underlying file system structure on the disk. As will be described in more 
detail below, this is in contrast to virtual folders, which create location-independent 
views that allow users to manipulate files and folders in ways that are similar to those 
currently used for manipulating physical folders" (Paragraph 95) and "FIG. 17 is a 
diagram illustrative of a screen display in which a quick link for physical folders is 
selected. The selection box SB is shown to be around the "all folders" quick link 616. As 
will be described in more detail below with respect to FIG. 18, the "all folders" quick link 
616 provides for switching to a view of physical folders. FIG. 18 is a diagram illustrative 
of a screen display showing physical folders. The physical folders that are shown 
contain the files of the virtual folder stacks of FIG. 17. In other words, the items 
contained within the stacks 651-655 of FIG. 17 are also contained in certain physical 
folders in the system. These are shown in FIG. 18 as a "My Documents" folder 851 that 
is located on the present computer, a "Desktop" folder 852 that is located on the present 
computer, a "Foo" folder 853 that is located on the hard drive C:, a "My Files" folder 854 
that is located on a server, an "External Drive" folder 855 that is located on an external 
drive, a "My Documents" folder 856 that is located on another computer, and a 
"Desktop" folder 857 that is located on another computer" (Paragraphs 115-116). 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because Moore's 
teachings would have allowed the combination of Bellegarda and Vivisimo to provide users the ability 
to toggle between virtual folder representations and physical folder representations 
based on their desires, as noted by Moore (Paragraph 117). 

Regarding claim 18, Bellegarda further teaches a computer-readable media 
comprising: 

A) wherein said files are text documents (Page 1279, Abstract); and 

B) the similarities are based upon the word content of the files (Page 1281, Section: A. 
Feature Extraction, Section: B. Singular Value Decomposition). 

The examiner notes that Bellegarda teaches "wherein said files are text 
documents" as "This paper focuses on the use of latent semantic analysis, a paradigm 
that automatically uncovers the salient semantic relationships between words and 
documents in a given corpus" (Page 1279, Abstract). The examiner further notes that 
Bellegarda teaches "the similarities are based upon the word content of the files" 
as "The starting point is the construction of a matrix (W) of co-occurrences between 
words and documents" (Page 1281, Section: A. Feature Extraction) and "The (M x N) 
word-document matrix W resulting from the above feature extraction defines two vector 
representations for the words and the documents. Each word w_i can be uniquely 
associated with a row vector of dimension N, and each document d_j can be uniquely 
associated with a column vector of dimension M" (Page 1281, Section: B. Singular 
Value Decomposition). 
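
For illustration only, and not as part of the Bellegarda reference or the record, the word-document co-occurrence matrix W described in the passage quoted above can be sketched as follows. The sketch assumes raw occurrence counts and simple whitespace tokenization; the function name build_word_document_matrix and the sample documents are hypothetical, and any weighting or normalization that the reference applies to the counts is omitted.

```python
from collections import Counter


def build_word_document_matrix(documents):
    """Return (vocabulary, W) where W[i][j] counts word i in document j.

    Assumption: lowercase whitespace tokenization and raw counts only.
    """
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    counts = [Counter(tokens) for tokens in tokenized]
    # M rows (one per word), N columns (one per document).
    matrix = [[c[word] for c in counts] for word in vocabulary]
    return vocabulary, matrix


# Hypothetical sample documents.
docs = ["soap opera", "handmade soap", "soap bubbles soap"]
vocab, W = build_word_document_matrix(docs)

# Row vector of dimension N associated with the word "soap".
soap_row = W[vocab.index("soap")]
# Column vector of dimension M associated with the third document.
third_doc_column = [row[2] for row in W]
```

In this sketch each word corresponds to a row of length N (the number of documents) and each document to a column of length M (the vocabulary size), which mirrors the two vector representations named in the quoted passage.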

Regarding claim 19, Bellegarda further teaches a computer-readable media 
comprising: 

A) wherein said similarities are determined in accordance with a language model (Page 
1279, Abstract, Page 1281, Section: D. Organization); and 

B) the files are clustered in accordance with said model (Page 1279, Abstract, Page 
1281, Section: D. Organization). 

The examiner notes that Bellegarda teaches "wherein said similarities are 
determined in accordance with a language model" as "(discrete) words and 
documents are mapped onto a (continuous) semantic vector space, in which familiar 
clustering techniques can be applied. This leads to the specification of a powerful 
framework for automatic semantic classification, as well as the derivation of several 
language model families with various smoothing properties" (Page 1279, Abstract). The 
examiner further notes that Bellegarda teaches "the files are clustered in 
accordance with said model" as "(discrete) words and documents are mapped onto a 
(continuous) semantic vector space, in which familiar clustering techniques can be 
applied. This leads to the specification of a powerful framework for automatic semantic 
classification, as well as the derivation of several language model families with various 
smoothing properties" (Page 1279, Abstract). 

Regarding claim 20, Bellegarda further teaches a computer-readable media 
comprising: 

A) wherein said language model comprises the LSA paradigm (Page 1281, Section: D. 
Organization). 

The examiner notes that Bellegarda teaches "wherein said language model 
comprises the LSA paradigm" as "The focus of this paper is on semantically driven 
span extension only, and more specifically on how the LSA paradigm can be exploited 
to improve statistical language modeling" (Page 1281, Section: D. Organization). 

Regarding claim 21, Bellegarda further teaches a computer-readable media 
comprising: 

A) wherein said computer-executable code performs the steps of constructing a matrix 
which associates each word in the documents with a vector (Page 1281, Section: A. 
Feature Extraction, Section: B. Singular Value Decomposition); and 

B) associates each document with a vector (Page 1281, Section: A. Feature 
Extraction, Section: B. Singular Value Decomposition). 

The examiner notes that Bellegarda teaches "wherein said computer- 
executable code performs the steps of constructing a matrix which associates 
each word in the documents with a vector" as "The starting point is the construction 
of a matrix (W) of co-occurrences between words and documents" (Page 1281, Section: 
A. Feature Extraction) and "The (M x N) word-document matrix W resulting from the 
above feature extraction defines two vector representations for the words and the 
documents. Each word w_i can be uniquely associated with a row vector of dimension N, 
and each document d_j can be uniquely associated with a column vector of dimension M" 
(Page 1281, Section: B. Singular Value Decomposition). The examiner further notes 
that Bellegarda teaches "associates each document with a vector" as "The (M x N) 
word-document matrix W resulting from the above feature extraction defines two vector 
representations for the words and the documents. Each word w_i can be uniquely 
associated with a row vector of dimension N, and each document d_j can be uniquely 
associated with a column vector of dimension M" (Page 1281, Section: B. Singular 
Value Decomposition). 

Regarding claim 22, Bellegarda further teaches a computer-readable media 
comprising: 

A) wherein said computer-executable code further performs step of decomposing said 
matrix to define the words and documents as vectors in a continuous vector space 
(Page 1281, Section: A. Feature Extraction, Section: B. Singular Value 
Decomposition). 

The examiner notes that Bellegarda teaches "wherein said computer- 
executable code further performs step of decomposing said matrix to define the 
words and documents as vectors in a continuous vector space" as "To address 
these issues, it is useful to employ a singular value decomposition (SVD), a technique 
closely related to eigenvector decomposition and factor analysis" (Page 1281, Section: 
B. Singular Value Decomposition). 

Regarding claim 23, Bellegarda further teaches a computer-readable media 
comprising: 

A) wherein said computer-executable code performs clustering by identifying 
documents whose vectors are within a threshold distance of one another (Page 1284, 
Section: A. Word Clustering). 

The examiner notes that Bellegarda teaches "wherein said computer- 
executable code performs clustering by identifying documents whose vectors are 
within a threshold distance of one another" as "This opens up the opportunity to 
apply familiar clustering techniques in S, as long as a distance measure consistent with 
the SVD formalism is defined on the vector space" (Page 1286, Section: A. Framework 
Extension). 
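
For illustration only, and not as a statement of how Bellegarda or the claimed invention performs the step, clustering by a threshold distance between document vectors can be sketched as follows. The sketch assumes cosine distance and single-link grouping by union-find; the reference requires only a distance measure consistent with the SVD formalism, so both choices are assumptions. The function names and the sample vectors are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(u, v); assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def cluster_by_threshold(vectors, threshold):
    """Group indices whose vectors are linked by distances <= threshold."""
    parent = list(range(len(vectors)))

    def find(i):
        # Follow parent links to the cluster representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine_distance(vectors[i], vectors[j]) <= threshold:
                parent[find(j)] = find(i)  # merge the two clusters

    clusters = {}
    for i in range(len(vectors)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical document vectors in a two-dimensional space.
document_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
fine = cluster_by_threshold(document_vectors, threshold=0.05)
coarse = cluster_by_threshold(document_vectors, threshold=0.9)
```

With these sample vectors, the smaller threshold separates the documents into two groups and the larger threshold merges them into one, which shows how the threshold value controls the granularity of the result in this sketch.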

Regarding claim 25, Bellegarda does not explicitly teach a computer-readable 
media comprising: 

A) wherein said computer-executable code performs step of automatically labeling the 
clusters based on the resulting clusters. 

Vivisimo, however, teaches "wherein said computer-executable code 
performs step of automatically labeling the clusters based on the resulting 
clusters" as "Document clustering is the automatic organization of documents into 
groups or clusters. "Document clustering" differs from other techniques (classification, 
taxonomy building, Northern Light, etc.) in that it is fully automated: there is no human 
intervention at any point (except that people wrote the basic algorithms). The biggest 
challenge for document clustering has been to quickly find meaningful groups that are 
concisely annotated. Our innovation relies on a newly discovered heuristic algorithm 
that does this well. Our clustering algorithm has achieved good results on web pages, 
patent abstracts, newswires, meeting transcripts, and television transcripts with little or 
no customization in every case" (Page 03) and "Instead of producing a flat list of groups, 
Vivisimo organizes groups into a hierarchy or tree, using a well-known "Windows 
Explorer"-style interface. This interface can be used with no training since it is quite 
intuitive. Users can zoom in on items of interest while keeping an overview of all the 
search results" (Page 03), "Conceptual clustering methods interleave the process of 
forming groups with the step of annotating them, much like people might do by hand. 
So, if Vivisimo tries to form a group but judges that the group cannot be described well, 
the group is rejected. In contrast, some other approaches rely mainly on mathematical 
optimization, in which description of the groups is relegated to the end after the groups 
are formed, which gives generally worse results" (Page 03), and "We are gratified that 
users sometimes ask this. The annotations are created spontaneously by the software. 
When they are good, it seems that a human being must have created the categories 
and the machine merely recognizes the documents that belong there, which is not the 
case. However, our technology is not perfect: the diligent user will surely spot an 
occasional annotation that only a machine would make up" (Page 04). 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Vivisimo's teachings would have allowed Bellegarda's method to provide clustering that is 
user friendly, concise, and fast, as noted by Vivisimo (Page 04). 

Regarding claim 26, Bellegarda does not explicitly teach a computer-readable 
media comprising: 

A) wherein said labeling comprises selecting representative words based on the 
closeness of their vectors to the document vectors in a cluster. 

Vivisimo, however, teaches "wherein said labeling comprises selecting 
representative words based on the closeness of their vectors to the document 
vectors in a cluster" as "Document clustering is the automatic organization of 
documents into groups or clusters. "Document clustering" differs from other techniques 
(classification, taxonomy building, Northern Light, etc.) in that it is fully automated: there 
is no human intervention at any point (except that people wrote the basic algorithms). 
The biggest challenge for document clustering has been to quickly find meaningful 
groups that are concisely annotated. Our innovation relies on a newly discovered 
heuristic algorithm that does this well. Our clustering algorithm has achieved good 
results on web pages, patent abstracts, newswires, meeting transcripts, and television 
transcripts with little or no customization in every case" (Page 03) and "Instead of 
producing a flat list of groups, Vivisimo organizes groups into a hierarchy or tree, using 
a well-known "Windows Explorer"-style interface. This interface can be used with no 
training since it is quite intuitive. Users can zoom in on items of interest while keeping 
an overview of all the search results" (Page 03), "Conceptual clustering methods 
interleave the process of forming groups with the step of annotating them, much like 
people might do by hand. So, if Vivisimo tries to form a group but judges that the group 
cannot be described well, the group is rejected. In contrast, some other approaches rely 
mainly on mathematical optimization, in which description of the groups is relegated to 
the end after the groups are formed, which gives generally worse results" (Page 03), 
and "We are gratified that users sometimes ask this. The annotations are created 
spontaneously by the software. When they are good, it seems that a human being must 
have created the categories and the machine merely recognizes the documents that 
belong there, which is not the case. However, our technology is not perfect: the diligent 
user will surely spot an occasional annotation that only a machine would make up" 
(Page 04). 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Vivisimo's teachings would have allowed Bellegarda's method to provide clustering that is user 
friendly, concise, and fast, as noted by Vivisimo (Page 04). 

Regarding claim 27, Bellegarda further teaches a computer-readable media 
comprising: 

A) wherein the computer executable code performs the following steps: clustering text 
files within the file system using semantic similarities (Page 1279, Abstract). 

The examiner notes that Bellegarda teaches "a semantic hierarchy that is 
based upon the content of said files" as "(discrete) words and documents are 
mapped onto a (continuous) semantic vector space, in which familiar clustering 
techniques can be applied. This leads to the specification of a powerful framework for 
automatic semantic classification, as well as the derivation of several language model 
families with various smoothing properties" (Page 1279, Abstract). 

Bellegarda does not explicitly teach: 

B) clustering non-text files within the files system using rule-based techniques; 

C) labeling the resulting clusters; and 

D) displaying the files in a hierarchical format based on the resulting clusters and 
labels. 

Vivisimo, however, teaches "clustering non-text files within the files system 
using rule-based techniques" as "Document clustering is the automatic organization 
of documents into groups or clusters. "Document clustering" differs from other 
techniques (classification, taxonomy building, Northern Light, etc.) in that it is fully 
automated: there is no human intervention at any point (except that people wrote the 
basic algorithms). The biggest challenge for document clustering has been to quickly 
find meaningful groups that are concisely annotated. Our innovation relies on a newly 
discovered heuristic algorithm that does this well. Our clustering algorithm has achieved 
good results on web pages, patent abstracts, newswires, meeting transcripts, and 
television transcripts with little or no customization in every case" (Page 03) and 
"Instead of producing a flat list of groups, Vivisimo organizes groups into a hierarchy or 
tree, using a well-known "Windows Explorer"-style interface. This interface can be used 
with no training since it is quite intuitive. Users can zoom in on items of interest while 
keeping an overview of all the search results" (Page 03), and "Conceptual clustering 
methods interleave the process of forming groups with the step of annotating them, 
much like people might do by hand. So, if Vivisimo tries to form a group but judges that 
the group cannot be described well, the group is rejected. In contrast, some other 
approaches rely mainly on mathematical optimization, in which description of the groups 
is relegated to the end after the groups are formed, which gives generally worse results" 
(Page 03), "labeling the resulting clusters" as "Document clustering is the automatic 
organization of documents into groups or clusters. "Document clustering" differs from 
other techniques (classification, taxonomy building, Northern Light, etc.) in that it is fully 
automated: there is no human intervention at any point (except that people wrote the 
basic algorithms). The biggest challenge for document clustering has been to quickly 
find meaningful groups that are concisely annotated. Our innovation relies on a newly 
discovered heuristic algorithm that does this well. Our clustering algorithm has achieved 
good results on web pages, patent abstracts, newswires, meeting transcripts, and 
television transcripts with little or no customization in every case" (Page 03) and 
"Instead of producing a flat list of groups, Vivisimo organizes groups into a hierarchy or 
tree, using a well-known "Windows Explorer"-style interface. This interface can be used 
with no training since it is quite intuitive. Users can zoom in on items of interest while 
keeping an overview of all the search results" (Page 03), "Conceptual clustering 
methods interleave the process of forming groups with the step of annotating them, 
much like people might do by hand. So, if Vivisimo tries to form a group but judges that 
the group cannot be described well, the group is rejected. In contrast, some other 
approaches rely mainly on mathematical optimization, in which description of the groups 
is relegated to the end after the groups are formed, which gives generally worse results" 
(Page 03), and "We are gratified that users sometimes ask this. The annotations are 
created spontaneously by the software. When they are good, it seems that a human 
being must have created the categories and the machine merely recognizes the 
documents that belong there, which is not the case. However, our technology is not 
perfect: the diligent user will surely spot an occasional annotation that only a machine 
would make up" (Page 04), and "displaying the files in a hierarchical format based 
on the resulting clusters and labels" as "Document clustering is the automatic 
organization of documents into groups or clusters. "Document clustering" differs from 
other techniques (classification, taxonomy building, Northern Light, etc.) in that it is fully 
automated: there is no human intervention at any point (except that people wrote the 
basic algorithms). The biggest challenge for document clustering has been to quickly 
find meaningful groups that are concisely annotated. Our innovation relies on a newly 
discovered heuristic algorithm that does this well. Our clustering algorithm has achieved 
good results on web pages, patent abstracts, newswires, meeting transcripts, and 
television transcripts with little or no customization in every case" (Page 03), "Instead of 
producing a flat list of groups, Vivisimo organizes groups into a hierarchy or tree, using 
a well-known "Windows Explorer"-style interface. This interface can be used with no 
training since it is quite intuitive. Users can zoom in on items of interest while keeping 
an overview of all the search results" (Page 03), "No. Simple one-word queries often 
lead to clusters that modify the query. For example, "soap" can lead to "soap opera", 
"handmade soap", and "soap bubbles", but also to "simple object access protocol", 
known also by its SOAP acronym" (Page 04), and "No. Sometimes a document fits well 
in more than one place in the hierarchy, so we place it everywhere it fits. For users, this 
is better than forcing documents to fit in a single location" (Page 04). 

The examiner further notes that the non-applied art of Arnold shows an interface 
of the Vivisimo search engine. Specifically, Arnold shows clusters of hierarchical 
folders that allow a user to drill down further if needed. 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Vivisimo's teachings would have allowed Bellegarda's system to provide clustering 
that is user-friendly, concise, and fast, as noted by Vivisimo (Page 04). 

Regarding claim 28, Bellegarda teaches a computer system comprising: 
A) a file system storing files (Pages 1279, 1291, Abstract); 

C) a processor for analyzing the content of files stored in said file system to map said 
files into a semantic vector space, cluster the files within said space based on multiple 
threshold values that are settable to desired levels of granularity (Pages 1279 and 1284, 
Abstract); 

The examiner notes that Bellegarda teaches "a file system storing files" as 
"(discrete) words and documents are mapped onto a (continuous) semantic vector 
space, in which familiar clustering techniques can be applied. This leads to the 
specification of a powerful framework for automatic semantic classification, as well as 
the derivation of several language model families with various smoothing properties" 
(Page 1279, Abstract) and "The general domain considered was business news, as 
reflected in the WSJ portion of the NAB corpus" (Page 1291). The examiner further 
notes that Bellegarda teaches "a processor for analyzing the content of files stored 
in said file system to map said files into a semantic vector space, cluster the files 
within said space based on multiple threshold values that are settable to desired 
levels of granularity" as "(discrete) words and documents are mapped onto a 
(continuous) semantic vector space, in which familiar clustering techniques can be 
applied. This leads to the specification of a powerful framework for automatic semantic 
classification, as well as the derivation of several language model families with various 
smoothing properties" (Page 1279, Abstract) and "Once (11) is specified, it is 
straightforward to proceed with the clustering of the word vectors, using any of a variety 
of algorithms (see, for instance, [2]). Since the number of such vectors is relatively 
large, it is advisable to perform this clustering in stages, using, for example, K-means 
and bottom-up clustering sequentially. In that case, K-means clustering is used to obtain 
a coarse partition of the vocabulary into a small set of superclusters. Each supercluster 
is then itself partitioned using bottom-up clustering, resulting in a final set of clusters C_k, 
1 <= k <= K. This process can be thought of as uncovering, in a data-driven fashion, a 
particular layer of semantic knowledge in the space" (Page 1284). 
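
For illustration only, the staged clustering described in the quoted passage (a coarse K-means partition followed by bottom-up clustering inside each supercluster) can be sketched as follows. This is a minimal sketch and not Bellegarda's implementation: the function names, the evenly spaced seeding, the Euclidean distance, and the merge threshold are all assumptions introduced here.

```python
import math

def dist(a, b):
    # Euclidean distance; the distance measure is an assumption of this sketch.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    return tuple(sum(col) / len(points) for col in zip(*points))

def kmeans(points, k, iters=20):
    """Coarse stage: partition the points into at most k superclusters."""
    # Evenly spaced seeds keep the sketch deterministic (an assumption).
    centroids = [points[i * len(points) // k] for i in range(k)]
    groups = []
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centroids[c]))
            groups[nearest].append(p)
        centroids = [mean(g) if g else centroids[i] for i, g in enumerate(groups)]
    return [g for g in groups if g]

def bottom_up(points, threshold):
    """Fine stage: merge the two closest clusters until none are within threshold."""
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist(mean(clusters[i]), mean(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > threshold:
            break
        _, i, j = best
        merged = clusters.pop(j)      # j > i, so index i stays valid
        clusters[i].extend(merged)
    return clusters

def two_stage(points, k, threshold):
    return [c for sc in kmeans(points, k) for c in bottom_up(sc, threshold)]

points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0),
          (10.0, 10.0), (10.1, 10.0), (11.0, 11.0), (11.1, 11.0)]
superclusters = kmeans(points, 2)
result = two_stage(points, 2, 0.5)
```

Raising or lowering the hypothetical `threshold` argument changes how far the second stage merges, which is one way to picture a settable level of granularity.
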
Bellegarda does not explicitly teach: 

B) a display device; and 
D) derive a hierarchy of plural levels of clusters from said clustering; 

E) a user interface which provides a user an option of displaying files stored in said file 
system in the form of said derived hierarchy of plural level of clusters. 

Vivisimo, however, teaches "a display device" as "Document clustering is the 
automatic organization of documents into groups or clusters. "Document clustering" 
differs from other techniques (classification, taxonomy building, Northern Light, etc.) in 
that it is fully automated: there is no human intervention at any point (except that people 
wrote the basic algorithms). The biggest challenge for document clustering has been to 
quickly find meaningful groups that are concisely annotated. Our innovation relies on a 
newly discovered heuristic algorithm that does this well. Our clustering algorithm has 
achieved good results on web pages, patent abstracts, newswires, meeting transcripts, 
and television transcripts with little or no customization in every case" (Page 03), 
"Instead of producing a flat list of groups, Vivisimo organizes groups into a hierarchy or 
tree, using a well-known "Windows Explorer"-style interface. This interface can be used 
with no training since it is quite intuitive. Users can zoom in on items of interest while 
keeping an overview of all the search results" (Page 03), "No. Simple one-word queries 
often lead to clusters that modify the query. For example, "soap" can lead to "soap 
opera", "handmade soap", and "soap bubbles", but also to "simple object access 
protocol", known also by its SOAP acronym" (Page 04), and "No. Sometimes a 
document fits well in more than one place in the hierarchy, so we place it everywhere it 
fits. For users, this is better than forcing documents to fit in a single location" (Page 04), 
"a user interface which provides a user an option of displaying files stored in said 
file system in the form of said derived hierarchy of plural level of clusters" as 
"Document clustering is the automatic organization of documents into groups or 
clusters. "Document clustering" differs from other techniques (classification, taxonomy 
building, Northern Light, etc.) in that it is fully automated: there is no human intervention 
at any point (except that people wrote the basic algorithms). The biggest challenge for 
document clustering has been to quickly find meaningful groups that are concisely 
annotated. Our innovation relies on a newly discovered heuristic algorithm that does this 
well. Our clustering algorithm has achieved good results on web pages, patent 
abstracts, newswires, meeting transcripts, and television transcripts with little or no 
customization in every case" (Page 03), "Instead of producing a flat list of groups, 
Vivisimo organizes groups into a hierarchy or tree, using a well-known "Windows 
Explorer"-style interface. This interface can be used with no training since it is quite 
intuitive. Users can zoom in on items of interest while keeping an overview of all the 
search results" (Page 03), "No. Simple one-word queries often lead to clusters that 
modify the query. For example, "soap" can lead to "soap opera", "handmade soap", and 
"soap bubbles", but also to "simple object access protocol", known also by its SOAP 
acronym" (Page 04), and "No. Sometimes a document fits well in more than one place 
in the hierarchy, so we place it everywhere it fits. For users, this is better than forcing 
documents to fit in a single location" (Page 04), and "a user interface which displays 
representations of files stored in said file system in the form of said derived 
hierarchy of plural level of clusters" as "Document clustering is the automatic 
organization of documents into groups or clusters. "Document clustering" differs from 
other techniques (classification, taxonomy building, Northern Light, etc.) in that it is fully 
automated: there is no human intervention at any point (except that people wrote the 
basic algorithms). The biggest challenge for document clustering has been to quickly 
find meaningful groups that are concisely annotated. Our innovation relies on a newly 
discovered heuristic algorithm that does this well. Our clustering algorithm has achieved 
good results on web pages, patent abstracts, newswires, meeting transcripts, and 
television transcripts with little or no customization in every case" (Page 03), "Instead of 
producing a flat list of groups, Vivisimo organizes groups into a hierarchy or tree, using 
a well-known "Windows Explorer"-style interface. This interface can be used with no 
training since it is quite intuitive. Users can zoom in on items of interest while keeping 
an overview of all the search results" (Page 03), "No. Simple one-word queries often 
lead to clusters that modify the query. For example, "soap" can lead to "soap opera", 
"handmade soap", and "soap bubbles", but also to "simple object access protocol", 
known also by its SOAP acronym" (Page 04), and "No. Sometimes a document fits well 
in more than one place in the hierarchy, so we place it everywhere it fits. For users, this 
is better than forcing documents to fit in a single location" (Page 04). 
The examiner further notes that the non-applied art of Arnold shows an interface 
of the Vivisimo search engine. Specifically, Arnold shows clusters of hierarchical 
folders that allow a user to drill down further if needed. 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Vivisimo's teachings would have allowed Bellegarda's system to provide clustering 
that is user-friendly, concise, and fast, as noted by Vivisimo (Page 04). 

Bellegarda and Vivisimo do not explicitly teach: 
E) a user interface which provides a user an option of displaying the files in a 
hierarchical format based on locations of the files in the file system. 

Moore, however, teaches "a user interface which provides a user an option 
of displaying the files in a hierarchical format based on locations of the files in 
the file system" as "FIG. 5 is a tree diagram of a folder structure in accordance with a 
physical folder arrangement on a hard drive. This physical folder arrangement is based 
on the traditional implementation of folders, which may be based on NTFS or other 
existing file systems. Such folders are referred to as physical folders because their 
structuring is based on the actual physical underlying file system structure on the disk. 
As will be described in more detail below, this is in contrast to virtual folders, which 
create location-independent views that allow users to manipulate files and folders in 
ways that are similar to those currently used for manipulating physical folders" 
(Paragraph 95) and "FIG. 17 is a diagram illustrative of a screen display in which a 
quick link for physical folders is selected. The selection box SB is shown to be around 
the "all folders" quick link 616. As will be described in more detail below with respect to 
FIG. 18, the "all folders" quick link 616 provides for switching to a view of physical 
folders. FIG. 18 is a diagram illustrative of a screen display showing physical folders. 
The physical folders that are shown contain the files of the virtual folder stacks of FIG. 
17. In other words, the items contained within the stacks 651-655 of FIG. 17 are also 
contained in certain physical folders in the system. These are shown in FIG. 18 as a 
"My Documents" folder 851 that is located on the present computer, a "Desktop" folder 
852 that is located on the present computer, a "Foo" folder 853 that is located on the 
hard drive C:, a "My Files" folder 854 that is located on a server, an "External Drive" 
folder 855 that is located on an external drive, a "My Documents" folder 856 that is 
located on another computer, and a "Desktop" folder 857 that is located on another 
computer" (Paragraphs 115-116). 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Moore's teachings would have allowed the combined system of Bellegarda and 
Vivisimo to give users the ability to toggle between virtual folder representations 
and physical folder representations as desired, as noted by Moore (Paragraph 117). 
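
As a rough illustration of the toggle discussed above, the sketch below groups the same files either by their physical folder or by an assigned cluster label. It is not taken from Moore, Vivisimo, or the application: the function names, the `mode` values, and the sample paths and cluster labels are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath

def physical_view(paths):
    """Group file names by the folder that physically contains them."""
    view = defaultdict(list)
    for p in paths:
        path = PurePosixPath(p)
        view[str(path.parent)].append(path.name)
    return dict(view)

def cluster_view(paths, cluster_of):
    """Group file names by cluster label, independent of location."""
    view = defaultdict(list)
    for p in paths:
        view[cluster_of[p]].append(PurePosixPath(p).name)
    return dict(view)

def display(paths, cluster_of, mode):
    # mode is a hypothetical user option: "clusters" or "folders".
    if mode == "clusters":
        return cluster_view(paths, cluster_of)
    return physical_view(paths)

paths = ["/docs/a.txt", "/docs/b.txt", "/desktop/c.txt"]
cluster_of = {"/docs/a.txt": "finance",
              "/docs/b.txt": "weather",
              "/desktop/c.txt": "finance"}
by_folder = display(paths, cluster_of, "folders")
by_cluster = display(paths, cluster_of, "clusters")
```
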

Regarding claim 30, Bellegarda further teaches a computer system comprising: 

A) wherein said files are text documents (Page 1279, Abstract); and 

B) said processor maps said files on the basis of a language model (Page 1279, 
Abstract). 

The examiner notes that Bellegarda teaches "wherein said files are text 
documents" as "This paper focuses on the use of latent semantic analysis, a paradigm 
that automatically uncovers the salient semantic relationships between words and 
documents in a given corpus" (Page 1279, Abstract). The examiner further notes that 
Bellegarda teaches "said processor maps said files on the basis of a language 
model" as "(discrete) words and documents are mapped onto a (continuous) semantic 
vector space, in which familiar clustering techniques can be applied. This leads to the 
specification of a powerful framework for automatic semantic classification, as well as 
the derivation of several language model families with various smoothing properties" 
(Page 1279, Abstract). 

Regarding claim 31, Bellegarda further teaches a computer system comprising: 
A) wherein said processor constructs a matrix which associates each word in the 
documents with a vector (Page 1281, Section: A. Feature Extraction, Section: B. 
Singular Value Decomposition); and 

Application/Control Number: 10/644,815 Page 36 
Art Unit: 2168 

B) associates each document with a vector (Page 1281, Section: A. Feature 
Extraction, Section: B. Singular Value Decomposition). 

The examiner notes that Bellegarda teaches "wherein said processor 
constructs a matrix which associates each word in the documents with a vector" 
as "The starting point is the construction of a matrix (W) of co-occurrences between 
words and documents" (Page 1281, Section: A. Feature Extraction) and "The (M x N) 
word-document matrix W resulting from the above feature extraction defines two vector 
representations for the words and the documents. Each word w_i can be uniquely 
associated with a row vector of dimension N, and each document d_j can be uniquely 
associated with a column vector of dimension M" (Page 1281, Section: B. Singular 
Value Decomposition). The examiner further notes that Bellegarda teaches 
"associates each document with a vector" as "The (M x N) word-document matrix W 
resulting from the above feature extraction defines two vector representations for the 
words and the documents. Each word w_i can be uniquely associated with a row vector 
of dimension N, and each document d_j can be uniquely associated with a column vector 
of dimension M" (Page 1281, Section: B. Singular Value Decomposition). 
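
To make the row-vector and column-vector reading of the quoted passage concrete, the following minimal sketch builds a small word-document matrix W. It assumes raw word counts and whitespace tokenization; any weighting or normalization used in the cited paper is not reproduced here, and the function name and sample documents are hypothetical.

```python
from collections import Counter

def word_document_matrix(documents):
    """Return (vocabulary, W) where W[i][j] counts word i in document j."""
    counts = [Counter(doc.lower().split()) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[word] for c in counts] for word in vocabulary]
    return vocabulary, matrix

docs = ["stocks fell sharply", "stocks rose", "rain fell"]
vocab, W = word_document_matrix(docs)

# Row i of W is the vector for word w_i (dimension N = number of documents).
word_vector = W[vocab.index("fell")]
# Column j of W is the vector for document d_j (dimension M = vocabulary size).
document_vector = [row[1] for row in W]
```
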

Regarding claim 32, Bellegarda further teaches a computer system 
comprising: 

A) wherein said processor further decomposes said matrix to define the words and 
documents as vectors in a continuous vector space (Page 1281, Section: A. Feature 
Extraction, Section: B. Singular Value Decomposition). 

The examiner notes that Bellegarda teaches "wherein said processor further 
decomposes said matrix to define the words and documents as vectors in a 
continuous vector space" as "To address these issues, it is useful to employ a 
singular value decomposition (SVD), a technique closely related to eigenvector 
decomposition and factor analysis" (Page 1281, Section: B. Singular Value 
Decomposition). 
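
As a hedged illustration of the decomposition step, the sketch below recovers only the leading singular triplet of a matrix by power iteration. This is a simplified stand-in: a full singular value decomposition would normally come from a linear algebra library, and the function names and the small example matrix are assumptions made for this sketch. It also assumes a nonzero matrix whose largest singular value is distinct.

```python
import math

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def norm(x):
    return math.sqrt(sum(v * v for v in x))

def leading_singular_triplet(W, iters=200):
    """Power iteration on W^T W; returns (u, sigma, v) for the largest singular value."""
    Wt = transpose(W)
    v = [1.0] * len(W[0])
    for _ in range(iters):
        v = matvec(Wt, matvec(W, v))   # multiply by W^T W
        n = norm(v)
        v = [x / n for x in v]         # renormalize to avoid overflow
    Wv = matvec(W, v)
    sigma = norm(Wv)
    u = [x / sigma for x in Wv]
    return u, sigma, v

W = [[3.0, 0.0],
     [0.0, 1.0]]
u, sigma, v = leading_singular_triplet(W)
```
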
Regarding claim 33, Bellegarda further teaches a computer system comprising: 
A) wherein said processor clusters the files by identifying documents whose vectors are 
within a threshold distance of one another (Page 1284, Section: A. Word Clustering). 

The examiner notes that Bellegarda teaches "wherein said processor 
clusters the files by identifying documents whose vectors are within a threshold 
distance of one another" as "This opens up the opportunity to apply familiar clustering 
techniques in S, as long as a distance measure consistent with the SVD formalism is 
defined on the vector space" (Page 1286, Section: A. Framework Extension). 
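
One way to picture clustering by threshold distance is sketched below: documents are linked whenever their vectors lie within a chosen distance of one another, and linked documents form a cluster. The cosine-based distance, the single-link grouping, and all names and sample vectors are assumptions of this sketch and are not drawn from the cited reference.

```python
import math

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)

def threshold_clusters(vectors, threshold):
    """Group indices whose vectors are linked by pairwise distances <= threshold."""
    parent = list(range(len(vectors)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine_distance(vectors[i], vectors[j]) <= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(vectors)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

doc_vectors = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]
fine = threshold_clusters(doc_vectors, 0.05)    # tight threshold, more clusters
coarse = threshold_clusters(doc_vectors, 0.9)   # loose threshold, fewer clusters
```

Running the same vectors through two threshold values yields two levels of granularity, which could in principle be stacked into a hierarchy.
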

Regarding claim 35, Bellegarda does not explicitly teach a computer system 
comprising: 

A) wherein said processor automatically labels the clusters based on the resulting 
clusters. 

Vivisimo, however, teaches "wherein said processor automatically labels the 
clusters based on the resulting clusters" as "Document clustering is the automatic 
organization of documents into groups or clusters. "Document clustering" differs from 
other techniques (classification, taxonomy building, Northern Light, etc.) in that it is fully 
automated: there is no human intervention at any point (except that people wrote the 
basic algorithms). The biggest challenge for document clustering has been to quickly 
find meaningful groups that are concisely annotated. Our innovation relies on a newly 
discovered heuristic algorithm that does this well. Our clustering algorithm has achieved 
good results on web pages, patent abstracts, newswires, meeting transcripts, and 
television transcripts with little or no customization in every case" (Page 03) and 
"Instead of producing a flat list of groups, Vivisimo organizes groups into a hierarchy or 
tree, using a well-known "Windows Explorer"-style interface. This interface can be used 
with no training since it is quite intuitive. Users can zoom in on items of interest while 
keeping an overview of all the search results" (Page 03), "Conceptual clustering 
methods interleave the process of forming groups with the step of annotating them, 
much like people might do by hand. So, if Vivisimo tries to form a group but judges that 
the group cannot be described well, the group is rejected. In contrast, some other 
approaches rely mainly on mathematical optimization, in which description of the groups 
is relegated to the end after the groups are formed, which gives generally worse results" 
(Page 03), and "We are gratified that users sometimes ask this. The annotations are 
created spontaneously by the software. When they are good, it seems that a human 
being must have created the categories and the machine merely recognizes the 
documents that belong there, which is not the case. However, our technology is not 
perfect: the diligent user will surely spot an occasional annotation that only a machine 
would make up" (Page 04). 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Vivisimo's teachings would have allowed Bellegarda's system to provide clustering 
that is user-friendly, concise, and fast, as noted by Vivisimo (Page 04). 

Regarding claim 36, Bellegarda does not explicitly teach a computer system 
comprising: 

A) wherein said processor labels the clusters by selecting representative words based 
on the closeness of their vectors to the document vectors in a cluster. 

Vivisimo, however, teaches "wherein said processor labels the clusters by 
selecting representative words based on the closeness of their vectors to the 
document vectors in a cluster" as "Document clustering is the automatic organization 
of documents into groups or clusters. "Document clustering" differs from other 
techniques (classification, taxonomy building, Northern Light, etc.) in that it is fully 
automated: there is no human intervention at any point (except that people wrote the 
basic algorithms). The biggest challenge for document clustering has been to quickly 
find meaningful groups that are concisely annotated. Our innovation relies on a newly 
discovered heuristic algorithm that does this well. Our clustering algorithm has achieved 
good results on web pages, patent abstracts, newswires, meeting transcripts, and 
television transcripts with little or no customization in every case" (Page 03) and 
"Instead of producing a flat list of groups, Vivisimo organizes groups into a hierarchy or 
tree, using a well-known "Windows Explorer"-style interface. This interface can be used 
with no training since it is quite intuitive. Users can zoom in on items of interest while 
keeping an overview of all the search results" (Page 03), "Conceptual clustering 
methods interleave the process of forming groups with the step of annotating them, 
much like people might do by hand. So, if Vivisimo tries to form a group but judges that 
the group cannot be described well, the group is rejected. In contrast, some other 
approaches rely mainly on mathematical optimization, in which description of the groups 
is relegated to the end after the groups are formed, which gives generally worse results" 
(Page 03), and "We are gratified that users sometimes ask this. The annotations are 
created spontaneously by the software. When they are good, it seems that a human 
being must have created the categories and the machine merely recognizes the 
documents that belong there, which is not the case. However, our technology is not 
perfect: the diligent user will surely spot an occasional annotation that only a machine 
would make up" (Page 04). 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Vivisimo's teachings would have allowed Bellegarda's system to provide clustering 
that is user-friendly, concise, and fast, as noted by Vivisimo (Page 04). 
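
The claim 36 limitation, selecting representative words by the closeness of their vectors to the document vectors in a cluster, can be pictured with the minimal sketch below. It assumes that words and documents share one vector space and that closeness is measured by cosine similarity to the cluster centroid. The cited Vivisimo pages do not disclose an algorithm, so nothing here should be read as Vivisimo's method; all names and vectors are hypothetical.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def centroid(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def label_cluster(word_vectors, cluster_doc_vectors, top_n=2):
    """Return the top_n words whose vectors lie closest to the cluster centroid."""
    center = centroid(cluster_doc_vectors)
    ranked = sorted(word_vectors,
                    key=lambda w: cosine(word_vectors[w], center),
                    reverse=True)
    return ranked[:top_n]

word_vectors = {"soap": (0.9, 0.1), "opera": (0.8, 0.3), "protocol": (0.1, 0.9)}
cluster = [(1.0, 0.0), (0.9, 0.2)]
labels = label_cluster(word_vectors, cluster, top_n=2)
```
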

Regarding claim 37, Bellegarda does not explicitly teach a method comprising: 
A) wherein said deriving step includes organizing the clusters into a hierarchical 
directory structure. 

Vivisimo, however, teaches "wherein said deriving step includes organizing 
the clusters into a hierarchical directory structure" as "Document clustering is the 
automatic organization of documents into groups or clusters. "Document clustering" 
differs from other techniques (classification, taxonomy building, Northern Light, etc.) in 
that it is fully automated: there is no human intervention at any point (except that people 
wrote the basic algorithms). The biggest challenge for document clustering has been to 
quickly find meaningful groups that are concisely annotated. Our innovation relies on a 
newly discovered heuristic algorithm that does this well. Our clustering algorithm has 
achieved good results on web pages, patent abstracts, newswires, meeting transcripts, 
and television transcripts with little or no customization in every case" (Page 03), 
"Instead of producing a flat list of groups, Vivisimo organizes groups into a hierarchy or 
tree, using a well-known "Windows Explorer"-style interface. This interface can be used 
with no training since it is quite intuitive. Users can zoom in on items of interest while 
keeping an overview of all the search results" (Page 03), "No. Simple one-word queries 
often lead to clusters that modify the query. For example, "soap" can lead to "soap 
opera", "handmade soap", and "soap bubbles", but also to "simple object access 
protocol", known also by its SOAP acronym" (Page 04), and "No. Sometimes a 
document fits well in more than one place in the hierarchy, so we place it everywhere it 
fits. For users, this is better than forcing documents to fit in a single location" (Page 04). 

The examiner further notes that the non-applied art of Arnold shows an interface 
of the Vivisimo search engine. Specifically, Arnold shows clusters of hierarchical 
folders that allow a user to drill down further if needed. 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Vivisimo's teachings would have allowed Bellegarda's system to provide clustering 
that is user-friendly, concise, and fast, as noted by Vivisimo (Page 04). 
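
A hierarchical directory structure of clusters, in which a document may appear in more than one place, can be sketched as a nested mapping rendered as an indented tree. The data layout, the `[+]` folder marker, and the sample labels and file names are hypothetical and are only loosely modeled on the "soap" example quoted above.

```python
def render_tree(node, depth=0):
    """Return indented lines for a nested {label: subclusters-or-file-list} hierarchy."""
    lines = []
    for label, children in node.items():
        lines.append("  " * depth + "[+] " + label)
        if isinstance(children, dict):
            # A dict holds subclusters, so recurse one level deeper.
            lines.extend(render_tree(children, depth + 1))
        else:
            # A list holds the files that belong to this leaf cluster.
            lines.extend("  " * (depth + 1) + name for name in children)
    return lines

hierarchy = {
    "soap": {
        "soap opera": ["a.txt"],
        "simple object access protocol": ["b.txt", "c.txt"],
    },
    "handmade": ["c.txt"],   # c.txt is placed everywhere it fits
}
tree_lines = render_tree(hierarchy)
```
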

Regarding claim 38, Bellegarda teaches a method comprising: 

A) mapping all words of the plurality of documents in the file system and the plurality of 
documents in a semantic vector space (Pages 1279, 1291, Abstract); 

B) generating a plurality of clusters based on the semantic similarities of the plurality of 
documents and multiple threshold values that are settable to desired levels of 
granularity (Pages 1279 and 1284, Abstract). 

The examiner notes that Bellegarda teaches "mapping all words of the 
plurality of documents in the file system and the plurality of documents in a 
semantic vector space" as "(discrete) words and documents are mapped onto a 
(continuous) semantic vector space, in which familiar clustering techniques can be 
applied. This leads to the specification of a powerful framework for automatic semantic 
classification, as well as the derivation of several language model families with various 
smoothing properties" (Page 1279, Abstract) and "The general domain considered was 
business news, as reflected in the WSJ portion of the NAB corpus" (Page 1291). The 
examiner further notes that Bellegarda teaches "generating a plurality of clusters 
based on the semantic similarities of the plurality of documents and multiple 
threshold values that are settable to desired levels of granularity" as "(discrete) 
words and documents are mapped onto a (continuous) semantic vector space, in which 
familiar clustering techniques can be applied. This leads to the specification of a 
powerful framework for automatic semantic classification, as well as the derivation of 
several language model families with various smoothing properties" (Page 1279, 
Abstract) and "Once (11) is specified, it is straightforward to proceed with the clustering 
of the word vectors, using any of a variety of algorithms (see, for instance, [2]). Since 
the number of such vectors is relatively large, it is advisable to perform this clustering in 
stages, using, for example, K-means and bottom-up clustering sequentially. In that 
case, K-means clustering is used to obtain a coarse partition of the vocabulary into a 
small set of superclusters. Each supercluster is then itself partitioned using bottom-up 
clustering, resulting in a final set of clusters C_k, 1 <= k <= K. This process can be thought 
of as uncovering, in a data-driven fashion, a particular layer of semantic knowledge in 
the space" (Page 1284). 

Bellegarda does not explicitly teach: 

C) organizing the plurality of clusters into directories in a hierarchical format of plural 
levels of clusters; 

D) providing a user an option of displaying the plurality of documents in said 
hierarchical format of plural levels of clusters based on a result of clustering the plurality 
of documents. 

Vivisimo, however, teaches "organizing the plurality of clusters into 
directories in a hierarchical format of plural levels of clusters" as "Document 
clustering is the automatic organization of documents into groups or clusters. 
"Document clustering" differs from other techniques (classification, taxonomy building, 
Northern Light, etc.) in that it is fully automated: there is no human intervention at any 
point (except that people wrote the basic algorithms). The biggest challenge for 
document clustering has been to quickly find meaningful groups that are concisely 
annotated. Our innovation relies on a newly discovered heuristic algorithm that does this 
well. Our clustering algorithm has achieved good results on web pages, patent 
abstracts, newswires, meeting transcripts, and television transcripts with little or no 
customization in every case" (Page 03), "Instead of producing a flat list of groups, 
Vivisimo organizes groups into a hierarchy or tree, using a well-known "Windows 
Explorer"-style interface. This interface can be used with no training since it is quite 
intuitive. Users can zoom in on items of interest while keeping an overview of all the 
search results" (Page 03), "No. Simple one-word queries often lead to clusters that 
modify the query. For example, "soap" can lead to "soap opera", "handmade soap", and 
"soap bubbles", but also to "simple object access protocol", known also by its SOAP 
acronym" (Page 04), and "No. Sometimes a document fits well in more than one place 
in the hierarchy, so we place it everywhere it fits. For users, this is better than forcing 
documents to fit in a single location" (Page 04), and "providing a user an option of 
displaying the plurality of documents in said hierarchical format of plural levels of 
clusters based on a result of clustering the plurality of documents" as "Document 
clustering is the automatic organization of documents into groups or clusters. 
"Document clustering" differs from other techniques (classification, taxonomy building, 
Northern Light, etc.) in that it is fully automated: there is no human intervention at any 
point (except that people wrote the basic algorithms). The biggest challenge for 
document clustering has been to quickly find meaningful groups that are concisely 
annotated. Our innovation relies on a newly discovered heuristic algorithm that does this 
well. Our clustering algorithm has achieved good results on web pages, patent 
abstracts, newswires, meeting transcripts, and television transcripts with little or no 
customization in every case" (Page 03), "Instead of producing a flat list of groups, 
Vivisimo organizes groups into a hierarchy or tree, using a well-known "Windows 
Explorer"-style interface. This interface can be used with no training since it is quite 
intuitive. Users can zoom in on items of interest while keeping an overview of all the 
search results" (Page 03), "No. Simple one-word queries often lead to clusters that 
modify the query. For example, "soap" can lead to "soap opera", "handmade soap", and 
"soap bubbles", but also to "simple object access protocol", known also by its SOAP 
acronym" (Page 04), and "No. Sometimes a document fits well in more than one place 
in the hierarchy, so we place it everywhere it fits. For users, this is better than forcing 
documents to fit in a single location" (Page 04). 

The examiner further notes that the non-applied art of Arnold shows an interface 
of the Vivisimo search engine. Specifically, it shows clusters of hierarchical 
folders that allow a user to drill down further if need be. 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Vivisimo's teaching would have allowed Bellegarda's method to provide a clustering 
that is user-friendly, concise, and fast, as noted by Vivisimo (Page 04). 

Bellegarda and Vivisimo do not explicitly teach: 
D) providing a user an option of displaying the documents in a hierarchical format 
based on locations of the documents in the file system. 

Moore, however, teaches "providing a user an option of displaying the 
documents in a hierarchical format based on locations of the documents in the 
file system" as "FIG. 5 is a tree diagram of a folder structure in accordance with a 
physical folder arrangement on a hard drive. This physical folder arrangement is based 
on the traditional implementation of folders, which may be based on NTFS or other 
existing file systems. Such folders are referred to as physical folders because their 
structuring is based on the actual physical underlying file system structure on the disk. 
As will be described in more detail below, this is in contrast to virtual folders, which 
create location-independent views that allow users to manipulate files and folders in 
ways that are similar to those currently used for manipulating physical folders" 
(Paragraph 95) and "FIG. 17 is a diagram illustrative of a screen display in which a 
quick link for physical folders is selected. The selection box SB is shown to be around 
the "all folders" quick link 616. As will be described in more detail below with respect to 
FIG. 18, the "all folders" quick link 616 provides for switching to a view of physical 
folders. FIG. 18 is a diagram illustrative of a screen display showing physical folders. 
The physical folders that are shown contain the files of the virtual folder stacks of FIG. 
17. In other words, the items contained within the stacks 651-655 of FIG. 17 are also 
contained in certain physical folders in the system. These are shown in FIG. 18 as a 
"My Documents" folder 851 that is located on the present computer, a "Desktop" folder 
852 that is located on the present computer, a "Foo" folder 853 that is located on the 
hard drive C:, a "My Files" folder 854 that is located on a server, an "External Drive" 
folder 855 that is located on an external drive, a "My Documents" folder 856 that is 
located on another computer, and a "Desktop" folder 857 that is located on another 
computer" (Paragraphs 115-116). 
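
For illustration only, the physical folder arrangement that Moore describes can be pictured as a tree built directly from file locations. The following is a minimal Python sketch, not Moore's implementation. The function names `build_folder_tree` and `render` and the example paths are hypothetical, and the folder names are loosely borrowed from the FIG. 18 passage quoted above.

```python
from pathlib import PurePosixPath


def build_folder_tree(paths):
    """Nest file paths into a dict-of-dicts tree keyed by folder name.

    Folders map to nested dicts; files are leaves with the value None.
    """
    tree = {}
    for path in paths:
        parts = PurePosixPath(path).parts
        node = tree
        for folder in parts[:-1]:
            node = node.setdefault(folder, {})
        node[parts[-1]] = None
    return tree


def render(tree, depth=0):
    """Return the tree as indented lines, sorted by name at each level."""
    lines = []
    for name in sorted(tree):
        lines.append("  " * depth + name)
        if tree[name] is not None:
            lines.extend(render(tree[name], depth + 1))
    return lines


# Hypothetical file locations (assumption, not taken from Moore).
example_paths = [
    "My Documents/report.doc",
    "My Documents/notes/todo.txt",
    "Desktop/photo.jpg",
]
tree = build_folder_tree(example_paths)
print("\n".join(render(tree)))
```

The hierarchy here depends only on where each file is stored, which is the distinction the quoted passage draws between physical folders and location-independent virtual folders.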

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Moore's teaching would have allowed the methods of Bellegarda and Vivisimo to 
provide users the ability to toggle between virtual folder representations and 
physical folder representations based on their desires, as noted by Moore 
(Paragraph 117). 

Regarding claim 48, Bellegarda further teaches a method comprising: 
A) wherein the multiple threshold values are characteristic values of clusters from said 
clustering (Page 1284). 

The examiner notes that Bellegarda teaches "wherein the multiple threshold 
values are characteristic values of clusters from said clustering" as "Once (11) is 
specified, it is straightforward to proceed with the clustering of the word vectors, using 
any of a variety of algorithms (see, for instance, [2]). Since the number of such vectors 
is relatively large, it is advisable to perform this clustering in stages, using, for example, 
K-means and bottom-up clustering sequentially. In that case, K-means clustering is 
used to obtain a coarse partition of the vocabulary into a small set of superclusters. 
Each supercluster is then itself partitioned using bottom-up clustering, resulting in a final 
set of clusters Ck, 1 <= k <= K. This process can be thought of as uncovering, in a 
data-driven fashion, a particular layer of semantic knowledge in the space" (1284). 
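
For illustration only, the staged clustering described in the quoted passage (a coarse K-means partition followed by bottom-up clustering within each supercluster) might be sketched in Python as follows. This is a simplified sketch under stated assumptions, not Bellegarda's implementation: the evenly spaced seeding, the Euclidean distance, the centroid-linkage merge rule, the `max_gap` stopping parameter, and the sample points are all assumptions introduced here.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def kmeans(points, k, iterations=20):
    """Coarse stage: plain K-means with evenly spaced seeds (assumption)."""
    step = len(points) // k
    centers = [points[i * step] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: dist(p, centers[j]))
            groups[nearest].append(p)
        # Keep the old center if a group ends up empty.
        centers = [centroid(g) if g else centers[j] for j, g in enumerate(groups)]
    return [g for g in groups if g]


def bottom_up(points, max_gap):
    """Fine stage: repeatedly merge the two clusters with the closest
    centroids until the smallest gap exceeds max_gap (assumed stop rule)."""
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        gap, i, j = min(
            (dist(centroid(clusters[i]), centroid(clusters[j])), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if gap > max_gap:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def two_stage(points, k, max_gap):
    """Run K-means, then bottom-up clustering inside each supercluster."""
    final = []
    for supercluster in kmeans(points, k):
        final.extend(bottom_up(supercluster, max_gap))
    return final


# Hypothetical 2-D vectors standing in for word vectors.
points = [(0, 0), (0, 1), (5, 0), (5, 1), (20, 0), (20, 1), (25, 0), (25, 1)]
clusters = two_stage(points, k=2, max_gap=2.0)
print(clusters)
```

On these sample points the coarse stage yields two superclusters and the fine stage splits each into two, giving four final clusters.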

Regarding claim 50, Bellegarda further teaches a graphical user interface 
comprising: 
A) wherein the multiple threshold values are characteristic values of clusters from said 
clustering (Page 1284). 

The examiner notes that Bellegarda teaches "wherein the multiple threshold 
values are characteristic values of clusters from said clustering" as "Once (11) is 
specified, it is straightforward to proceed with the clustering of the word vectors, using 
any of a variety of algorithms (see, for instance, [2]). Since the number of such vectors 
is relatively large, it is advisable to perform this clustering in stages, using, for example, 
K-means and bottom-up clustering sequentially. In that case, K-means clustering is 
used to obtain a coarse partition of the vocabulary into a small set of superclusters. 
Each supercluster is then itself partitioned using bottom-up clustering, resulting in a final 
set of clusters Ck, 1 <= k <= K. This process can be thought of as uncovering, in a 
data-driven fashion, a particular layer of semantic knowledge in the space" (1284). 

Regarding claim 52, Bellegarda further teaches a computer readable media 
comprising: 

A) wherein the multiple threshold values are characteristic values of clusters from said 
clustering (Page 1284). 

The examiner notes that Bellegarda teaches "wherein the multiple threshold 
values are characteristic values of clusters from said clustering" as "Once (11) is 
specified, it is straightforward to proceed with the clustering of the word vectors, using 
any of a variety of algorithms (see, for instance, [2]). Since the number of such vectors 
is relatively large, it is advisable to perform this clustering in stages, using, for example, 
K-means and bottom-up clustering sequentially. In that case, K-means clustering is 
used to obtain a coarse partition of the vocabulary into a small set of superclusters. 
Each supercluster is then itself partitioned using bottom-up clustering, resulting in a final 
set of clusters Ck, 1 <= k <= K. This process can be thought of as uncovering, in a 
data-driven fashion, a particular layer of semantic knowledge in the space" (1284). 

Regarding claim 54, Bellegarda further teaches a computer system comprising: 
A) wherein the multiple threshold values are characteristic values of clusters from said 
clustering (Page 1284). 

The examiner notes that Bellegarda teaches "wherein the multiple threshold 
values are characteristic values of clusters from said clustering" as "Once (11) is 
specified, it is straightforward to proceed with the clustering of the word vectors, using 
any of a variety of algorithms (see, for instance, [2]). Since the number of such vectors 
is relatively large, it is advisable to perform this clustering in stages, using, for example, 
K-means and bottom-up clustering sequentially. In that case, K-means clustering is 
used to obtain a coarse partition of the vocabulary into a small set of superclusters. 
Each supercluster is then itself partitioned using bottom-up clustering, resulting in a final 
set of clusters Ck, 1 <= k <= K. This process can be thought of as uncovering, in a 
data-driven fashion, a particular layer of semantic knowledge in the space" (1284). 

Regarding claim 56, Bellegarda further teaches a method comprising: 
A) wherein the multiple threshold values are characteristic values of clusters from said 
clustering (Page 1284). 

The examiner notes that Bellegarda teaches "wherein the multiple threshold 
values are characteristic values of clusters from said clustering" as "Once (11) is 
specified, it is straightforward to proceed with the clustering of the word vectors, using 
any of a variety of algorithms (see, for instance, [2]). Since the number of such vectors 
is relatively large, it is advisable to perform this clustering in stages, using, for example, 
K-means and bottom-up clustering sequentially. In that case, K-means clustering is 
used to obtain a coarse partition of the vocabulary into a small set of superclusters. 
Each supercluster is then itself partitioned using bottom-up clustering, resulting in a final 
set of clusters Ck, 1 <= k <= K. This process can be thought of as uncovering, in a 
data-driven fashion, a particular layer of semantic knowledge in the space" (1284). 

Regarding claim 58, Bellegarda does not explicitly teach a method comprising: 
A) providing a user an option to reorganize the files in the file system according to the 
derived hierarchy. 

Vivisimo, however, teaches "providing a user an option to reorganize the 
files in the file system according to the derived hierarchy" as "No. Sometimes a 
document fits well in more than one place in the hierarchy, so we place it everywhere it 
fits. For users, this is better than forcing documents to fit in a single location" (Page 04). 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Vivisimo's teaching would have allowed Bellegarda's method to provide a clustering 
that is user-friendly, concise, and fast, as noted by Vivisimo (Page 04). 

8. Claims 49, 51, 53, 55, and 57 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Bellegarda et al. (Article entitled "Exploiting Latent Semantic 
Information in Statistical Language Modeling", dated 10/26/2000) and in view of 
Vivisimo (Article entitled "Vivisimo FAQ", dated 02/04/2002), and further in view of 
Moore et al. (U.S. PGPUB 2004/0193621), as applied to claims 1-7, 9-11, 13-23, 
25-28, 30-33, 35-38, 48, 52, 54, 56, and 58 above, and further in view of Hertz (U.S. 
PGPUB 2003/0037041). 

9. Regarding claims 49, 51, 53, 55, and 57, Bellegarda, Vivisimo, and Moore do 
not explicitly teach a method, graphical user interface, computer-readable media, 
computer system, and computer comprising: 

A) wherein the characteristic values of the clusters are cluster variances of the clusters. 

Hertz, however, teaches "wherein the characteristic values of the clusters 
are cluster variances of the clusters" as "a real number determined by calculating the 
statistical variance of the profiles of all target objects in a cluster, is termed a "cluster 
variance,"" (Paragraph 13) and "The threshold used in step 6 is typically an affine 
function or other function of the greater of the cluster variances (or cluster diameters) of 
S and T" (Paragraph 326). 
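
For illustration only, a cluster variance and a threshold that is an affine function of the greater of two cluster variances might be computed as in the Python sketch below. This is a sketch under stated assumptions, not Hertz's implementation: the variance is taken here as the mean squared Euclidean distance of the profiles from their centroid, and the `slope` and `offset` values, the function names, and the sample clusters are hypothetical.

```python
def cluster_variance(profiles):
    """Mean squared distance of the profiles from their centroid
    (one common definition of variance, assumed here)."""
    n = len(profiles)
    dims = len(profiles[0])
    center = [sum(p[d] for p in profiles) / n for d in range(dims)]
    return sum(
        sum((p[d] - center[d]) ** 2 for d in range(dims)) for p in profiles
    ) / n


def merge_threshold(cluster_s, cluster_t, slope=2.0, offset=0.5):
    """Affine function of the greater of the two cluster variances.

    slope and offset are illustrative values, not taken from Hertz.
    """
    greater = max(cluster_variance(cluster_s), cluster_variance(cluster_t))
    return slope * greater + offset


# Hypothetical profiles for two clusters S and T.
S = [(0, 0), (2, 0)]
T = [(0, 0), (0, 4)]
print(cluster_variance(S), cluster_variance(T), merge_threshold(S, T))
```

Because the threshold depends on the clusters being compared, each pair of clusters can yield a different threshold value.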

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Hertz's teaching would have allowed the methods of Bellegarda, Vivisimo, and Moore 
to provide for a more efficient method of gathering data that interests users, as 
noted by Hertz (Paragraph 11). 

Response to Arguments 

10. Applicant's arguments with respect to claims 1-7, 9-11, 13-23, 25-28, 30-33, 
35-38, and 48-58 have been considered but are moot in view of the new ground(s) of 
rejection (See newly cited art of Moore). 

Applicant's arguments filed 09/13/2010 have been fully considered but they are 
not persuasive. 

Applicants argue on page 13 that "Bellegarda and Vivisimo, whether 
considered individually or in combination, do not disclose a method of displaying 
files within a file system that includes mapping the files in the file system into a 
semantic vector space, deriving a hierarchy based on clustering within the vector 
space". However, the examiner wishes to refer to Bellegarda which states "(discrete) 
words and documents are mapped onto a (continuous) semantic vector space, in which 
familiar clustering techniques can be applied. This leads to the specification of a 
powerful framework for automatic semantic classification, as well as the derivation of 
several language model families with various smoothing properties" (Page 1279, 
Abstract) and "The general domain considered was business news, as reflected in the 
WSJ portion of the NAB corpus" (Page 1291). The examiner further wishes to state that 
it is clear that files from a file system were mapped into the semantic vector space of 
Bellegarda. Moreover, Vivisimo teaches the claimed deriving of a hierarchy. 

Conclusion 

11. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. 

U.S. Patent 6,820,094 issued to Ferguson et al. on 16 November 2004. The 
subject matter disclosed therein is pertinent to that of claims 1-7, 9-11, 13-23, 25-28, 
30-33, 35-38, and 48-58 (e.g., methods to use smart folders to automatically 
organize and relate relevant files). 

U.S. Patent 5,819,258 issued to Vaithyanathan et al. on 06 October 1998. The 
subject matter disclosed therein is pertinent to that of claims 1-7, 9-11, 13-23, 25-28, 
30-33, 35-38, and 48-58 (e.g., methods to use smart folders to automatically 
organize and relate relevant files). 

U.S. Patent 6,360,227 issued to Aggarwal et al. on 19 March 2002. The subject 
matter disclosed therein is pertinent to that of claims 1-7, 9-11, 13-23, 25-28, 30-33, 
35-38, and 48-58 (e.g., methods to use smart folders to automatically organize and 
relate relevant files). 

U.S. Patent 5,899,995 issued to Millier et al. on 04 May 1999. The subject 
matter disclosed therein is pertinent to that of claims 1-7, 9-11, 13-23, 25-28, 30-33, 
35-38, and 48-58 (e.g., methods to use smart folders to automatically organize and 
relate relevant files). 

U.S. Patent 7,158,986 issued to Oliver et al. on 02 January 2007. The subject 
matter disclosed therein is pertinent to that of claims 1-7, 9-11, 13-23, 25-28, 30-33, 
35-38, and 48-58 (e.g., methods to use smart folders to automatically organize and 
relate relevant files). 

U.S. Patent 7,085,767 issued to Kusama on 01 August 2006. The subject 
matter disclosed therein is pertinent to that of claims 1-7, 9-11, 13-23, 25-28, 30-33, 
35-38, and 48-58 (e.g., methods to use smart folders to automatically organize and 
relate relevant files). 

U.S. PGPUB 2004/0249865 issued to Lee et al. on 09 December 2004. The 
subject matter disclosed therein is pertinent to that of claims 1-7, 9-11, 13-23, 25-28, 
30-33, 35-38, and 48-58 (e.g., methods to automatically name and label folders). 

U.S. PGPUB 2004/0148453 issued to Watanabe et al. on 29 July 2004. The 
subject matter disclosed therein is pertinent to that of claims 1-7, 9-11, 13-23, 25-28, 
30-33, 35-38, and 48-58 (e.g., methods to automatically name and label folders). 

Article entitled "Vivisimo: Clustering Delivers Information Overlook", dated 
05/03/2003, by Arnold. The subject matter disclosed therein is pertinent to that of 
claims 1-7, 9-11, 13-23, 25-28, 30-33, 35-38, and 48-58 (e.g., methods to use smart 
folders to automatically organize and relate relevant files). 

Contact Information 
12. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Mahesh Dwivedi whose telephone number is 
(571) 272-2731. The examiner can normally be reached on Monday to Friday 8:20 am - 4:40 pm. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Tim Vo, can be reached at (571) 272-3642. The fax number for the 
organization where this application or proceeding is assigned is (571) 273-8300. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 